Re: RFI: Database URIs
On Fri, Nov 22, 2013 at 05:13:53PM -0800, David E. Wheeler wrote: DBI Folks Gisle, I want to add support for specifying database connections as URIs to Sqitch, my DB change management system. I started working on it today, following the examples of JDBC and PostgreSQL. Before I release, though, I’d like a bit of feedback on a couple of things. First, I'm using the database name as the scheme in the URL. Some examples: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db This is to make it easy to tell one DB from another. But I'm wondering if I should follow the example of JDBC more closely, and prepend db: or something to them. More like DBI DSNs, too. However, it would require a bit more work, as URI.pm does not currently recognize multiple colon-delimited strings as scheme names AFAICT. :-( Why not define a direct translation from a URL to a DBI DSN? A translation that doesn't require knowledge of any driver-specifics. Tim.
Re: RFI: Database URIs
On Nov 25, 2013, at 11:02 PM, Jens Rehsack rehs...@gmail.com wrote: db:csv?f_dir=dataf_dir_search=foof_dir_search=barf_ext=.csv/rf_lock=2f_encoding=utf8csv_eol=%0D%0Acsv_sep_char=,csv_quote_char=%22csv_escape_char=%22csv+class=Text::CSV_XScsv_null=1RaiseError=1PrintError=1FetchHashKeyName=NAME_lccsv_tables=%7B%22info%22%3A%20%7B%22f_file%22%3A%22info_csv%22%7D%7D Happy hacking, when you want type that on command line. DBD::CSV (and other Pure-Perl drivers) is designed for flexibility and quick setup, not for expensive configuration and long-term running. Actually, I take that back. I am only focused on having a URI format for the DSN, not the attribute hash --- at least for now. So I would revise that URI to just dbi:csv:. Unless SQLEngine was modified to support additional parameters in the DSN, all the other stuff would have to be passed via the attribute hash. You can’t - this is a clear point in DBI::DBD::SqlEngine at the moment. Is that something that might change in the future? I tend to think of the DSN parts as being specific to a connection and specifiable by an end user --- such as the database name, server, username, password, etc. I tend to think of the attribute hash as for the developer, such as transaction and error handling. They are also more specific to the language -- I don't need a PrintError option in my Java app; it has some other API. All I say to you: if you want DBI supporting that - keep in mind the whole requirements. If you just want to do another DBI extension, with limited coverage and usage - feel free. I don't know if I want the DBI to support it or not. I would like to get feedback on how best to represent database connection information in URIs, and what I might be overlooking. I am not sure what to do about attribute hashes, and so am ignoring them for now. But this discussion orthogonal to my original questions, which were: * Should I use a hierarchical scheme like JDBC? I’m leaning heavily toward this now, just need a decent prefix, I'm thinking dbms, e.g., dbms:csv:foo“. You can’t - because there is no „foo“ database for DBD::CSV at the moment. Conceptual issue. In current state of development, those pure-perl DBMS simulating drivers have no persistent metadata as mysql and postgresql have. Well, that's just a matter of how you define that. Maybe foo is the name of the directory holding CSV files, and the file names are table names. I realize this is not how DBD::CSV currently works, just pointing out that these things are a matter of definition and implementation. I think you miss my point - I don’t say that you’re wrong, I say you lack a lot of requirements. When you don’t intend to support those pure-perl drivers, you’re fine and please go ahead. When you intend full DBI driver support, DBDI (https://github.com/timbunce/DBDI) could enlighten you. Following that concept would allow you a migration to p6 one fine day ;) I don't intend to support any drivers at all. I want to define a common URI standard for connecting to database engines without regard to any programming language or API at all. Best, David
Re: RFI: Database URIs
On Nov 25, 2013, at 11:21 PM, H.Merijn Brand h.m.br...@xs4all.nl wrote: As I always use 2. when writing scripts, mostly because I use quite a few useful attributes in the 4th argument already, it is the most logical place: easy to maintain, easy to read and easy to extend. Also very much specific to the DBI and DBDs, and a useful place for you as the developer to define how the DBI should interface with the database. One of the huge disadvantages of putting everything in the DSN is the you have to rethink if and when character like :, ;, , and ' should be escaped and how and that values like undef are pretty darn impossible (unless the driver has taken the trouble to support that). And as far as I can tell, the DBI does not currently support any escaping at all in the DSN. I created a database with a comma in its name, and was unable to figure out how to get the DBI to connect to it. The first method however has a big advantage when the script (or a one-liner) is completely driven by environment settings. my $dbh = DBI-connect ($ENV{DBI_DSN}); which happens to be the (underdocumented) default for connect and thus equivalent to my $dbh = DBI-connect; I dislike this, personally. As a developer, I *always* exercise a certain amount of control over what gets passed in the attribute hash. It would be a rare condition where I would want to let an end-user do it. A multi-database command-line tool like dbish would be one of those rare exceptions. Best, David
Re: RFI: Database URIs
On Nov 26, 2013, at 12:42 AM, Tim Bunce tim.bu...@pobox.com wrote: Why not define a direct translation from a URL to a DBI DSN? A translation that doesn't require knowledge of any driver-specifics. Because I want to the onus of connecting to the database to be on the developer, not the end-user. I personally don't want to have to look up the format of the URI for every database I connect to, and why the DBI DSN and JDBC URL formats annoy me. I would rather have one solid, standard way to handle most of the stuff needed for a connection string (host, port, username, password, database name) and then pass additional stuff in a standard way (a GET query string). So I have implemented a `dbi_dsn` method on the database URI classes I’ve written. The base class uses the most commonly-used format, which DBD::Pg, for one, relies on: $ perl-MURI -E 'say URI-new(db:pg://me:sec...@example.com/foo.db)-dbi_dsn' dbi:Pg:dbname=me:secret.com/foo.db And then in subclasses I overrode appropriate methods to change the format. Here’s Ingres, for example: $ perl -MURI -E 'say URI-new(db:ingres://me:sec...@example.com/foo.db)-dbi_dsn' dbi:Ingres:foo.db And here’s SQL Server: $ perl -Ilib -MURI -E 'say URI-new(db:sqlserver://me:sec...@example.com/foo.db)-dbi_dsn' dbi:ODBC:Driver={SQL Server};Server=me:secret.com;Database=foo.db Note that the URI is the same for all of these, except for the engine part of the scheme. Additional Engine-specific parameters can be passed in the GET string, and will simply be passed, through, e.g.,: $ perl -Ilib -MURI -E 'say URI-new(db:pg://me:sec...@example.com/foo.db??connect_timeout=10application_name=myapp)-dbi_dsn' dbi:Pg:host=me:secret.com;dbname=foo.db;?connect_timeout=10;application_name=myapp This puts the most of the onus on translating from the URL to the DBI DSN on the developer of the URI class (me in this case) rather than on the user. Best, David
Re: RFI: Database URIs
On Nov 26, 2013, at 10:02 AM, David E. Wheeler da...@justatheory.com wrote: $ perl-MURI -E 'say URI-new(db:pg://me:sec...@example.com/foo.db)-dbi_dsn' dbi:Pg:dbname=me:secret.com/foo.db Well, I can see I have a bug or two to work out. That should be: $ perl -MURI -Ilib -E 'say URI-new(db:pg://me:sec...@example.com/foo.db)-dbi_dsn' dbi:Pg:host=example.com;dbname=foo.db Thanks, David
Re: RFI: Database URIs
On Nov 26, 2013, at 10:53 AM, David E. Wheeler da...@justatheory.com wrote: Well, I can see I have a bug or two to work out. That should be: $ perl -MURI -Ilib -E 'say URI-new(db:pg://me:sec...@example.com/foo.db)-dbi_dsn' dbi:Pg:host=example.com;dbname=foo.db Oh silly me not escaping the @. Let’s try that again: $ perl -Ilib -MURI -E 'say URI-new(shift)-dbi_dsn' db:pg://me:sec...@example.com/foo_db dbi:Pg:host=example.com;dbname=foo_db $ perl -Ilib -MURI -E 'say URI-new(shift)-dbi_dsn' db:ingres://me:sec...@example.com/foo_db dbi:Ingres:foo_db $ perl -Ilib -MURI -E 'say URI-new(shift)-dbi_dsn' db:sqlserver://me:sec...@example.com/foo_db dbi:ODBC:Driver={SQL Server};Server=example.com;Database=foo_db Much saner. :-) Best, David
Re: RFI: Database URIs
On Nov 26, 2013, at 11:26 AM, Gisle Aas gi...@activestate.com wrote: I do find the db: prefix ugly. If you want users to see these strings I would think they find this prefix to be clutter too. Yeah. But I would thin, that if it *was* a standard, there would be just one scheme defined. That’s a guess, though. Maybe no one would care, and would not mind an unlimited number of new schemes (one for every db engine)? I don’t know. You seem to be alone in calling it pg:. For the other examples out there I see postgresql: or postgres:. Should all different ways be allowed and lead to the same thing? I've seen both mysql: and mysql2:. What about mariadb:? Yeah, I was going to add postgres and postgresql subclasses. Maybe pgsql, too, I dunno. Never seen mysql2; what’s that? Maria is the same as MySQL, really; I should add it, though, and have it inherit from db:mysql, just as vertica currently inherits from db:pg. For sqlite:-URLs people seem to disagree on how to specify relative vs absolute path names. Really? To me, an absolute path starts with /. Anything else is relative. I wish there actually was _a_ standard for this stuff. :-) I’m planning to write it up as a blog post and talk to some other communities about it. Thanks, David
Re: RFI: Database URIs
To me the value of database urls would be compatibility with other implementations of this obvious idea. Some examples I found by quick googling: http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html#database-urls http://docs.stackato.com/3.0/user/services/data-services.html#database-url https://github.com/kennethreitz/dj-database-url https://github.com/glenngillen/rails-database-url http://www.jguru.com/faq/view.jsp?EID=690 http://www.postgresql.org/docs/9.3/static/libpq-connect.html#LIBPQ-CONNSTRING It would be useful for somebody to survey these and then provide a perl module that's compatible with the consensus among these. It would also be a good place to document what the state of database urls currently is. --Gisle On Tue, Nov 26, 2013 at 9:42 AM, Tim Bunce tim.bu...@pobox.com wrote: On Fri, Nov 22, 2013 at 05:13:53PM -0800, David E. Wheeler wrote: DBI Folks Gisle, I want to add support for specifying database connections as URIs to Sqitch, my DB change management system. I started working on it today, following the examples of JDBC and PostgreSQL. Before I release, though, I’d like a bit of feedback on a couple of things. First, I'm using the database name as the scheme in the URL. Some examples: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db This is to make it easy to tell one DB from another. But I'm wondering if I should follow the example of JDBC more closely, and prepend db: or something to them. More like DBI DSNs, too. However, it would require a bit more work, as URI.pm does not currently recognize multiple colon-delimited strings as scheme names AFAICT. :-( Why not define a direct translation from a URL to a DBI DSN? A translation that doesn't require knowledge of any driver-specifics. Tim.
Re: RFI: Database URIs
I do find the db: prefix ugly. If you want users to see these strings I would think they find this prefix to be clutter too. You seem to be alone in calling it pg:. For the other examples out there I see postgresql: or postgres:. Should all different ways be allowed and lead to the same thing? I've seen both mysql: and mysql2:. What about mariadb:? For sqlite:-URLs people seem to disagree on how to specify relative vs absolute path names. I wish there actually was _a_ standard for this stuff. :-) --Gisle On Tue, Nov 26, 2013 at 7:14 PM, David E. Wheeler da...@justatheory.comwrote: On Nov 26, 2013, at 9:32 AM, Gisle Aas gi...@activestate.com wrote: To me the value of database urls would be compatibility with other implementations of this obvious idea. Some examples I found by quick googling: http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html#database-urls http://docs.stackato.com/3.0/user/services/data-services.html#database-url https://github.com/kennethreitz/dj-database-url https://github.com/glenngillen/rails-database-url http://www.jguru.com/faq/view.jsp?EID=690 http://www.postgresql.org/docs/9.3/static/libpq-connect.html#LIBPQ-CONNSTRING It would be useful for somebody to survey these and then provide a perl module that's compatible with the consensus among these. It would also be a good place to document what the state of database urls currently is. Thanks for those links They are all very similar (except for the rails one, which lacks docs). And they are similar to what I've come up with, detailed here: https://github.com/theory/uri-db/blob/master/lib/URI/db.pm Basically, a database format is actually two URIs. The first is an opaque URI, db:(.+). The second URL is inside the (.+), and can take two forms: $engine://$user:$pass@$host:$port$dbname?$params $engine:$dbname?$params Some examples from the test suite: db: db:pg: db:pg://localhost db:pg://localhost:33 db:pg://foo:123/try?foo=1foo=2lol=yes db:sqlite: db:sqlite:foo.db db:sqlite:/path/foo.db db:sqlite:///path/foo.db db:cubrid://localhost:33/foo db:db2://localhost:33/foo db:firebird://localhost:33/foo db:informix:foo.db db:informix:foo.db?foo=1 db:ingres:foo.db db:ingres:foo.db?foo=1 db:interbase://localhost:33/foo db:maxdb://localhost:33/foo db:maxdb://localhost/foo db:monetdb://localhost:1222?foo=1 db:monetdb://localhost/lolz db:mysql://localhost:33/foo db:oracle://localhost:33/foo db:sqlserver://localhost:33/foo db:sybase://localhost:33/foo db:teradata://localhost db:teradata://localhost:33/foo?hi=1 db:unify:foo.db db:unify: db:unify:?foo=1bar=2 db:vertica: db:vertica://localhost db:vertica://localhost:33 db:vertica://foo:123/try?foo=1foo=2lol=yes I think it’s pretty simple and familiar, and plan to put it (or something similar, based on this and other discussions) in Sqitch. Thoughts? David
Re: RFI: Database URIs
On Nov 26, 2013, at 11:49 AM, Gisle Aas gi...@activestate.com wrote: There is also precedence for using + in scheme names. Something like db+postgresql: then. It's still cluttered, and not really compatible with what other have used. Or x-postgres: while it's still experimental. Naming stuff is hard. Oh, I thought URI didn’t like +, but it turns out it is okay with it. $ perl -MURI -E 'say URI-new(x-ssh+foo://example.com)-scheme' x-ssh+foo Cool. Downside: Each DB engine requires its own scheme, which would make formal registration rather a PITA. I do see some registered “hierarchical” schemes using a dot, though, e.g., iris.beep, iris.xpc, etc.: http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml Surely someone has done this before, though. :-( Yeah, naming is hard. Probably why it’s one of my favorite things to do. :-) I don't know. Version 2 of the mysql protocol perhaps. I saw it at https://github.com/kennethreitz/dj-database-url/blob/master/dj_database_url.py#L17 Well, if it’s a real thing, I’m happy to add it. The scheme really should just be named after the protocol, not the kind of product you happen to find at the other end. But scheme != protocol. [Wikipedia](http://en.wikipedia.org/wiki/URI_scheme) says: URI schemes are frequently and incorrectly referred to as protocols, or specifically as URI protocols or URL protocols, since most were originally designed to be used with a particular protocol, and often have the same name. The http scheme, for instance, is generally used for interacting with web resources using HyperText Transfer Protocol. Today, URIs with that scheme are also used for other purposes, such as RDF resource identifiers and XML namespaces, that are not related to the protocol. Furthermore, some URI schemes are not associated with any specific protocol (e.g. file) and many others do not use the name of a protocol as their prefix (e.g. news). To me it makes sense not to tie it to a particular protocol. I want to connect to a database, and don’t much care about the protocol. The PostgreSQL libpq URL starts with postgresql://, not libpq://. Is sqlite:///path relative or absolute then? What about sqlite:/path? Both are absolute. sqlite://path and sqlite:path are both relative. This is how my implementation handles them. dj-database-url claims 4 slashes is the way to go; sqlite:path Wow, is that ugly. I was relying on the standard for file: URLs as the precedent. The ftp: standard had the same problem. It was once specified that ftp://server/%2Fpath was to be required to make the path absolute, while ftp://server/path was supposed to be relative to the home directory you ended up in after logging in to the ftp account. This was very confusing to users so most browers just implemented the path to always be absolute, with no way to access relative paths. That's at least how I remembered it. %2f is ugly. Of course, there is no host name in file: URIs. If we add it, as would be needed for Firebird, for example, I can see why dj-database-url ended up with four slashes: This is a full path: db:firebird://example.com//path/to/db So I guess these should be equivalent: db:firebird:path/to/db db:firebird:/path/to/db The first has an empty string for the network location, the second has no network location part at all. I’ve just committed a fix to support this. https://github.com/theory/uri-db/commit/cd60c7ac7e02572e5db3b39d2acb08b4a7fcfefe Best, David
Re: RFI: Database URIs
On Tue, Nov 26, 2013 at 08:49:43PM +0100, Gisle Aas wrote: The scheme really should just be named after the protocol, not the kind of product you happen to find at the other end. It would certainly be preferable if there was a single name for each protocol. ODBC complicates that further. Tim.
Re: RFI: Database URIs
On Tue, Nov 26, 2013 at 8:32 PM, David E. Wheeler da...@justatheory.comwrote: On Nov 26, 2013, at 11:26 AM, Gisle Aas gi...@activestate.com wrote: I do find the db: prefix ugly. If you want users to see these strings I would think they find this prefix to be clutter too. Yeah. But I would thin, that if it *was* a standard, there would be just one scheme defined. That’s a guess, though. Maybe no one would care, and would not mind an unlimited number of new schemes (one for every db engine)? I don’t know. There is also precedence for using + in scheme names. Something like db+postgresql: then. It's still cluttered, and not really compatible with what other have used. Or x-postgres: while it's still experimental. Naming stuff is hard. You seem to be alone in calling it pg:. For the other examples out there I see postgresql: or postgres:. Should all different ways be allowed and lead to the same thing? I've seen both mysql: and mysql2:. What about mariadb:? Yeah, I was going to add postgres and postgresql subclasses. Maybe pgsql, too, I dunno. Never seen mysql2; what’s that? I don't know. Version 2 of the mysql protocol perhaps. I saw it at https://github.com/kennethreitz/dj-database-url/blob/master/dj_database_url.py#L17 Maria is the same as MySQL, really; I should add it, though, and have it inherit from db:mysql, just as vertica currently inherits from db:pg. The scheme really should just be named after the protocol, not the kind of product you happen to find at the other end. It would certainly be preferable if there was a single name for each protocol. For sqlite:-URLs people seem to disagree on how to specify relative vs absolute path names. Really? To me, an absolute path starts with /. Anything else is relative. Is sqlite:///path relative or absolute then? What about sqlite:/path? dj-database-url claims 4 slashes is the way to go; sqlite:path The ftp: standard had the same problem. It was once specified that ftp://server/%2Fpath was to be required to make the path absolute, while ftp://server/path was supposed to be relative to the home directory you ended up in after logging in to the ftp account. This was very confusing to users so most browers just implemented the path to always be absolute, with no way to access relative paths. That's at least how I remembered it. --Gisle
Re: RFI: Database URIs
On Nov 26, 2013, at 12:43 PM, Tim Bunce tim.bu...@pobox.com wrote: ODBC complicates that further. Indeed. I want to avoid the protocol. I've now written up my proposal as a blog post: http://theory.so/rfc/2013/11/26/toward-a-database-uri-standard/ Thanks, David
Re: RFI: Database URIs
On Nov 25, 2013, at 3:50 AM, Jens Rehsack rehs...@gmail.com wrote: DBI-connect($dsn, $user, $passwd, \%attr) 4th argument is wasted in your current proposal. Er, well, I failed to provide a complete set of examples. Here’s one from the PostgreSQL docs: postgresql://other@localhost/otherdb?connect_timeout=10application_name=myapp All the attributes are provided by the GET query string. Best, David
Re: RFI: Database URIs
Am 25.11.2013 um 18:00 schrieb David E. Wheeler da...@justatheory.com: On Nov 25, 2013, at 3:50 AM, Jens Rehsack rehs...@gmail.com wrote: DBI-connect($dsn, $user, $passwd, \%attr) 4th argument is wasted in your current proposal. Er, well, I failed to provide a complete set of examples. Here’s one from the PostgreSQL docs: postgresql://other@localhost/otherdb?connect_timeout=10application_name=myapp All the attributes are provided by the GET query string. Let’s go - shoot: # specify most possible flags via driver flags $dbh = DBI-connect (dbi:CSV:, undef, undef, { f_schema = undef, f_dir= data, f_dir_search = [], f_ext= .csv/r, f_lock = 2, f_encoding = utf8, csv_eol = \r\n, csv_sep_char = ,, csv_quote_char = '', csv_escape_char = '', csv_class= Text::CSV_XS, csv_null = 1, csv_tables = { info = { f_file = info.csv } }, RaiseError = 1, PrintError = 1, FetchHashKeyName = NAME_lc, }) or die $DBI::errstr; And keep in mind, csv_tables can be more complex and there’re other attributes like it eg. in DBD::AnyData. Cheers -- Jens Rehsack rehs...@gmail.com
Re: RFI: Database URIs
On Nov 25, 2013, at 11:08 AM, Jens Rehsack rehs...@gmail.com wrote: Let’s go - shoot: # specify most possible flags via driver flags $dbh = DBI-connect (dbi:CSV:, undef, undef, { f_schema = undef, f_dir= data, f_dir_search = [], f_ext= .csv/r, f_lock = 2, f_encoding = utf8, csv_eol = \r\n, csv_sep_char = ,, csv_quote_char = '', csv_escape_char = '', csv_class= Text::CSV_XS, csv_null = 1, csv_tables = { info = { f_file = info.csv } }, RaiseError = 1, PrintError = 1, FetchHashKeyName = NAME_lc, }) or die $DBI::errstr; And keep in mind, csv_tables can be more complex and there’re other attributes like it eg. in DBD::AnyData. Well, how you would want to handle params would be up to you. No, they are not great for hashes, but do-able. db:csv?f_dir=dataf_dir_search=foof_dir_search=barf_ext=.csv/rf_lock=2f_encoding=utf8csv_eol=%0D%0Acsv_sep_char=,csv_quote_char=%22csv_escape_char=%22csv+class=Text::CSV_XScsv_null=1RaiseError=1PrintError=1FetchHashKeyName=NAME_lccsv_tables=%7B%22info%22%3A%20%7B%22f_file%22%3A%22info_csv%22%7D%7D So yeah, one would need to do some sort of parsting of nested data (JSON in the csv_tables example here), though arrays work okay (e.g., f_dir_search). OTOH, can you specify all this stuff in a DSN parseable by parse_dsn(), either? But my point is not to replace the connect() API, but to create a standard of sorts for representing connection info in a URL, to make it easier to specify how to connect to things on the command-line. Yeah, if you want to do everything, it will require more work, but that would be up to the code that handles each driver, which is to say the URI subclass for a particular DBMS. But this discussion orthogonal to my original questions, which were: * Should I use a hierarchical scheme like JDBC? I’m leaning heavily toward this now, just need a decent prefix, I'm thinking dbms, e.g., dbms:csv:foo. * Do I have the metadata wrong for any of the DBMSes I have so far added support for? Right now that’s just the ports they listen on by default: https://github.com/theory/uri-db/blob/master/t/db.t#L9 Best, David
Re: RFI: Database URIs
Am 25.11.2013 um 20:42 schrieb David E. Wheeler da...@justatheory.com: On Nov 25, 2013, at 11:08 AM, Jens Rehsack rehs...@gmail.com wrote: Let’s go - shoot: # specify most possible flags via driver flags $dbh = DBI-connect (dbi:CSV:, undef, undef, { f_schema = undef, f_dir= data, f_dir_search = [], f_ext= .csv/r, f_lock = 2, f_encoding = utf8, csv_eol = \r\n, csv_sep_char = ,, csv_quote_char = '', csv_escape_char = '', csv_class= Text::CSV_XS, csv_null = 1, csv_tables = { info = { f_file = info.csv } }, RaiseError = 1, PrintError = 1, FetchHashKeyName = NAME_lc, }) or die $DBI::errstr; And keep in mind, csv_tables can be more complex and there’re other attributes like it eg. in DBD::AnyData. Well, how you would want to handle params would be up to you. No, they are not great for hashes, but do-able. db:csv?f_dir=dataf_dir_search=foof_dir_search=barf_ext=.csv/rf_lock=2f_encoding=utf8csv_eol=%0D%0Acsv_sep_char=,csv_quote_char=%22csv_escape_char=%22csv+class=Text::CSV_XScsv_null=1RaiseError=1PrintError=1FetchHashKeyName=NAME_lccsv_tables=%7B%22info%22%3A%20%7B%22f_file%22%3A%22info_csv%22%7D%7D Happy hacking, when you want type that on command line. DBD::CSV (and other Pure-Perl drivers) is designed for flexibility and quick setup, not for expensive configuration and long-term running. So yeah, one would need to do some sort of parsting of nested data (JSON in the csv_tables example here), though arrays work okay (e.g., f_dir_search). OTOH, can you specify all this stuff in a DSN parseable by parse_dsn(), either? You can’t - this is a clear point in DBI::DBD::SqlEngine at the moment. But my point is not to replace the connect() API, but to create a standard of sorts for representing connection info in a URL, to make it easier to specify how to connect to things on the command-line. Yeah, if you want to do everything, it will require more work, but that would be up to the code that handles each driver, which is to say the URI subclass for a particular DBMS. All I say to you: if you want DBI supporting that - keep in mind the whole requirements. If you just want to do another DBI extension, with limited coverage and usage - feel free. But this discussion orthogonal to my original questions, which were: * Should I use a hierarchical scheme like JDBC? I’m leaning heavily toward this now, just need a decent prefix, I'm thinking dbms, e.g., dbms:csv:foo“. You can’t - because there is no „foo“ database for DBD::CSV at the moment. Conceptual issue. In current state of development, those pure-perl DBMS simulating drivers have no persistent metadata as mysql and postgresql have. * Do I have the metadata wrong for any of the DBMSes I have so far added support for? Right now that’s just the ports they listen on by default: https://github.com/theory/uri-db/blob/master/t/db.t#L9 I think you miss my point - I don’t say that you’re wrong, I say you lack a lot of requirements. When you don’t intend to support those pure-perl drivers, you’re fine and please go ahead. When you intend full DBI driver support, DBDI (https://github.com/timbunce/DBDI) could enlighten you. Following that concept would allow you a migration to p6 one fine day ;) Cheers -- Jens Rehsack rehs...@gmail.com
Re: RFI: Database URIs
On Tue, 26 Nov 2013 08:02:45 +0100, Jens Rehsack rehs...@gmail.com wrote: Am 25.11.2013 um 20:42 schrieb David E. Wheeler da...@justatheory.com: On Nov 25, 2013, at 11:08 AM, Jens Rehsack rehs...@gmail.com wrote: Let’s go - shoot: # specify most possible flags via driver flags $dbh = DBI-connect (dbi:CSV:, undef, undef, { f_schema = undef, f_dir= data, f_dir_search = [], f_ext= .csv/r, f_lock = 2, f_encoding = utf8, csv_eol = \r\n, csv_sep_char = ,, csv_quote_char = '', csv_escape_char = '', csv_class= Text::CSV_XS, csv_null = 1, csv_tables = { info = { f_file = info.csv } }, RaiseError = 1, PrintError = 1, FetchHashKeyName = NAME_lc, }) or die $DBI::errstr; And keep in mind, csv_tables can be more complex and there’re other attributes like it eg. in DBD::AnyData. Well, how you would want to handle params would be up to you. No, they are not great for hashes, but do-able. db:csv?f_dir=dataf_dir_search=foof_dir_search=barf_ext=.csv/rf_lock=2f_encoding=utf8csv_eol=%0D%0Acsv_sep_char=,csv_quote_char=%22csv_escape_char=%22csv+class=Text::CSV_XScsv_null=1RaiseError=1PrintError=1FetchHashKeyName=NAME_lccsv_tables=%7B%22info%22%3A%20%7B%22f_file%22%3A%22info_csv%22%7D%7D Happy hacking, when you want type that on command line. DBD::CSV (and other Pure-Perl drivers) is designed for flexibility and quick setup, not for expensive configuration and long-term running. So yeah, one would need to do some sort of parsting of nested data (JSON in the csv_tables example here), though arrays work okay (e.g., f_dir_search). OTOH, can you specify all this stuff in a DSN parseable by parse_dsn(), either? You can’t - this is a clear point in DBI::DBD::SqlEngine at the moment. But my point is not to replace the connect() API, but to create a standard of sorts for representing connection info in a URL, to make it easier to specify how to connect to things on the command-line. Yeah, if you want to do everything, it will require more work, but that would be up to the code that handles each driver, which is to say the URI subclass for a particular DBMS. All I say to you: if you want DBI supporting that - keep in mind the whole requirements. If you just want to do another DBI extension, with limited coverage and usage - feel free. But this discussion orthogonal to my original questions, which were: * Should I use a hierarchical scheme like JDBC? I’m leaning heavily toward this now, just need a decent prefix, I'm thinking dbms, e.g., dbms:csv:foo“. You can’t - because there is no „foo“ database for DBD::CSV at the moment. Conceptual issue. In current state of development, those pure-perl DBMS simulating drivers have no persistent metadata as mysql and postgresql have. * Do I have the metadata wrong for any of the DBMSes I have so far added support for? Right now that’s just the ports they listen on by default: https://github.com/theory/uri-db/blob/master/t/db.t#L9 I think you miss my point - I don’t say that you’re wrong, I say you lack a lot of requirements. When you don’t intend to support those pure-perl drivers, you’re fine and please go ahead. When you intend full DBI driver support, DBDI (https://github.com/timbunce/DBDI) could enlighten you. Following that concept would allow you a migration to p6 one fine day ;) You are both right ATM there are two ways to define the DSN: 1. Using DSN my $dbh = DBI-connect (dbi:Pg:dbname=foo;…); 2. Using driver + attributes my $dbh = DBI-connect (dbi:Pg:, $u, $p, { dbname = foo, … }); As I always use 2. when writing scripts, mostly because I use quite a few useful attributes in the 4th argument already, it is the most logical place: easy to maintain, easy to read and easy to extend. One of the huge disadvantages of putting everything in the DSN is the you have to rethink if and when character like :, ;, , and ' should be escaped and how and that values like undef are pretty darn impossible (unless the driver has taken the trouble to support that). The first method however has a big advantage when the script (or a one-liner) is completely driven by environment settings. my $dbh = DBI-connect ($ENV{DBI_DSN}); which happens to be the (underdocumented) default for connect and thus equivalent to my $dbh = DBI-connect; I don't know if having yet another way to define the DSN is going to add a lot of trouble or a lot of pleasure. What I *do* know is that if this will catch on, the DSN parts in DBI - including the docs - will be better tested :) -- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using perl5.00307 .. 5.19 porting perl5 on HP-UX, AIX, and
Re: RFI: Database URIs
On Nov 22, 2013, at 8:48 PM, Darren Duncan dar...@darrenduncan.net wrote: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db By database name do you mean DBMS name? Because I'd say the database name is what's on the right-hand side of the //, not what's on the left. Yes, correct. Another thing I was going to say is, if you wanted some standardization, you should distinguish the parts that are necessary to connect to a database from parts that just select a default schema in the database for interacting with. Getting a bit OT from my original questions here. But anyway, I’m following the JDBC and PostgreSQL examples here, where, frankly, are almost identical to most other URLs. This is a very good thing. By that I mean, remember that a PostgreSQL database and a MySQL database aren't actually the same concept. A PostgreSQL DBMS server gives access to multiple disjoint databases where you must name one to connect, and then separately from that is the optional concept of the current schema that you can select. A MySQL DBMS server gives access to exactly 1 database, which you can connect to without specifying a database name, and selecting a current schema (what they call a database) is optional for using MySQL. While I agree that a PostgresSQL database and a MySQL database are not the same thing, the MySQL community still calls it a database and tends to think of it that way, and JDBC has set the precedent for relying on what the community calls a “database” to be the “database” part of the URL: http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html The reason I say this is that DBI's uri scheme uses the same syntax for both database even though how they're treated is actually very different. And you don't have to do the same. I don’t have to, but that ship has long since sailed, so I will treat them the same as the DBI and JDBC. Best, David
RFI: Database URIs
DBI Folks Gisle, I want to add support for specifying database connections as URIs to Sqitch, my DB change management system. I started working on it today, following the examples of JDBC and PostgreSQL. Before I release, though, I’d like a bit of feedback on a couple of things. First, I'm using the database name as the scheme in the URL. Some examples: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db This is to make it easy to tell one DB from another. But I'm wondering if I should follow the example of JDBC more closely, and prepend db: or something to them. More like DBI DSNs, too. However, it would require a bit more work, as URI.pm does not currently recognize multiple colon-delimited strings as scheme names AFAICT. :-( Next, I've added a bunch of URI subclasses for various database engines. I’m not to familiar with some of them, so if you see any listed here where the name (to be used in the scheme) and the default port is wrong, please let me know: https://github.com/theory/uri-db/blob/master/t/db.t#L9 Thanks! David PS: Is this something the DBI might want to use?
Re: RFI: Database URIs
On 2013.11.22 5:13 PM, David E. Wheeler wrote: DBI Folks Gisle, I want to add support for specifying database connections as URIs to Sqitch, my DB change management system. I started working on it today, following the examples of JDBC and PostgreSQL. Before I release, though, I’d like a bit of feedback on a couple of things. First, I'm using the database name as the scheme in the URL. Some examples: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db This is to make it easy to tell one DB from another. But I'm wondering if I should follow the example of JDBC more closely, and prepend db: or something to them. More like DBI DSNs, too. However, it would require a bit more work, as URI.pm does not currently recognize multiple colon-delimited strings as scheme names AFAICT. :-( Next, I've added a bunch of URI subclasses for various database engines. I’m not to familiar with some of them, so if you see any listed here where the name (to be used in the scheme) and the default port is wrong, please let me know: https://github.com/theory/uri-db/blob/master/t/db.t#L9 Thanks! David PS: Is this something the DBI might want to use?
Re: RFI: Database URIs
Sorry, I think I hit send without adding a reply message; here it is. On 2013.11.22 5:13 PM, David E. Wheeler wrote: First, I'm using the database name as the scheme in the URL. Some examples: postgresql://user@localhost:5433/dbname sqlite:///path/to/foo.db By database name do you mean DBMS name? Because I'd say the database name is what's on the right-hand side of the //, not what's on the left. Another thing I was going to say is, if you wanted some standardization, you should distinguish the parts that are necessary to connect to a database from parts that just select a default schema in the database for interacting with. By that I mean, remember that a PostgreSQL database and a MySQL database aren't actually the same concept. A PostgreSQL DBMS server gives access to multiple disjoint databases where you must name one to connect, and then separately from that is the optional concept of the current schema that you can select. A MySQL DBMS server gives access to exactly 1 database, which you can connect to without specifying a database name, and selecting a current schema (what they call a database) is optional for using MySQL. The reason I say this is that DBI's uri scheme uses the same syntax for both database even though how they're treated is actually very different. And you don't have to do the same. -- Darren Duncan