Re: RFC: SQL Extensions for SQL::Statement [Long]
On Thu, 5 Jun 2003 10:50:45 +0100, Tim Bunce wrote: [quoting from the root message:] SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ... I think we're very much in the realms of inventing our own syntax here. We're just using 'standard sql' as a guide for everyones sanity. FWIW, MySQL allows the syntax SELECT * FROM database.table where table is the name of a table in the database with name database. I'm not sure how standard that is :) Both databases, the one you connect to, and the one mentioned here, must (?) reside on the same MySQL server, though. It's the only way I expect it to behave sanely, anyway. -- Bart.
Re: RFC: SQL Extensions for SQL::Statement [Long]
On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote: For per-table connections - CONNECT TO '$dsn1' AS tbl1 CONNECT TO '$dsn2' AS tbl2 Euh... you connect to a database, not to a table, no? -- Bart.
Re: RFC: SQL Extensions for SQL::Statement [Long]
On Thu, Jun 05, 2003 at 10:32:58PM +0200, Bart Lateur wrote: FWIW, MySQL allows the syntax SELECT * FROM database.table where table is the name of a table in the database with name database. I'm not sure how standard that is :) Much more standard is: SELECT * FROM schema.table MySQL doesn't have schemas, though, and their internal hierarchy makes that a reasonable substitute. -joh
Re: RFC: SQL Extensions for SQL::Statement [Long]
Bart Lateur wrote: On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote: For per-table connections - CONNECT TO '$dsn1' AS tbl1 CONNECT TO '$dsn2' AS tbl2 Euh... you connect to a database, not to a table, no? Well, yes and no. The point of these, as distinct from a $dbh connection is that they are per-table connections. One will use a single AnyData $dbh to connect to multiple tables each of which may reside in a different database, in a different rdbms, in a different location. So I feel it is important to somehow emphasize that these are not the same kind of connection as we usually talk about in DBI. Well, ok, under the hood they are going to be $dbhs and you will be able to address them, but for the purposes of the SQL syntax they behave differently. Still, I bow to the collective wisdom and will follow Tim's (and yours and Dean's) advice to use CONNECT TO to refer to connections. Gosh you guys are picky. You probably won't like my new SELECT syntax that does a delete either. :-) FWIW, MySQL allows the syntax SELECT * FROM database.table Well ODBC has something similar with catalog and schema thrown in with the possibility that they are not all in the same rdbms or same location. Oracle uses the [EMAIL PROTECTED] format where link is a heterogeneous database. Both databases, the one you connect to, and the one mentioned here, must (?) reside on the same MySQL server, though. It's the only way I expect it to behave sanely, anyway. Bart, you should know by now that sanity is not one of my core values :-). But you're right, in most cases something like this (which will be quite ordinary in DBD::AnyData) looks looney: SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] WHERE ... -- Jeff
Re: RFC: SQL Extensions for SQL::Statement [Long]
FWIW: years ago, an early SQL std. for distributed queries used dotted notation for everything (e.g., somehost.somecatalog.someschema.sometable.somecolumn). I don't know if thats still a part of the std or not...but having fiddled around with a similar concept, I concur with the '[EMAIL PROTECTED]' as a better, more obvious (and likely easier to safely parse) syntax. One potential gotcha to lookout for: SQL Server (MSFT and Sybase) use '@' in valid names (either system objects or procedures, can't recall which)... so you may need to be cautious with the parsing...I think I've only ever seen that at the beginning of an object name, so you should be able to trap it. Dean Arnold Presicient Corp. www.presicient.com - Original Message - From: Jeff Zucker [EMAIL PROTECTED] Cc: dbi-users [EMAIL PROTECTED] Sent: Thursday, June 05, 2003 1:57 PM Subject: Re: RFC: SQL Extensions for SQL::Statement [Long] Bart Lateur wrote: On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote: For per-table connections - CONNECT TO '$dsn1' AS tbl1 CONNECT TO '$dsn2' AS tbl2 Euh... you connect to a database, not to a table, no? Well, yes and no. The point of these, as distinct from a $dbh connection is that they are per-table connections. One will use a single AnyData $dbh to connect to multiple tables each of which may reside in a different database, in a different rdbms, in a different location. So I feel it is important to somehow emphasize that these are not the same kind of connection as we usually talk about in DBI. Well, ok, under the hood they are going to be $dbhs and you will be able to address them, but for the purposes of the SQL syntax they behave differently. Still, I bow to the collective wisdom and will follow Tim's (and yours and Dean's) advice to use CONNECT TO to refer to connections. Gosh you guys are picky. You probably won't like my new SELECT syntax that does a delete either. :-) FWIW, MySQL allows the syntax SELECT * FROM database.table Well ODBC has something similar with catalog and schema thrown in with the possibility that they are not all in the same rdbms or same location. Oracle uses the [EMAIL PROTECTED] format where link is a heterogeneous database. Both databases, the one you connect to, and the one mentioned here, must (?) reside on the same MySQL server, though. It's the only way I expect it to behave sanely, anyway. Bart, you should know by now that sanity is not one of my core values :-). But you're right, in most cases something like this (which will be quite ordinary in DBD::AnyData) looks looney: SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] WHERE ... -- Jeff
Re: RFC: SQL Extensions for SQL::Statement [Long]
On Wed, Jun 04, 2003 at 12:22:19PM -0700, Jeff Zucker wrote: Tim Bunce wrote: SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ... Both are a little ugly and prone to nested quote problems. Ew, nested quote problems, the bane of my existence. I guess I was seeing these as shortcuts for simple DSNs (disallowing nested quotes) and figuring more complex DSNs would be put in as placeholders (although, is allowing the dsn to be a placeholder SELECT foo from [EMAIL PROTECTED] WHERE ... a violation of the basic idea that placeholders are for values, not structure ... it is, but an awfully handy one in this case). I think we're very much in the realms of inventing our own syntax here. We're just using 'standard sql' as a guide for everyones sanity. For that perspective, using placeholders as you propose is just fine :) I think some form of naming would be handy: $dbh-{sql_sources} = { foo = 'dbi:Driver:dbname, # qualify with table name bar = $dbh,# qualify with table name baz = $sth,# implies select * from itself }; then: SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ... Well that hash will certainly exist and be addressable, but are you saying you don't like the CONNECT TO $dsn AS $table syntax? No. I'm just saying I dislike SELECT $cols FROM tbl1@$dsn1 Perhaps allow it but don't use it in examples :) Also, CONNECT TO $dsn AS $table seems rather confused between 'levels'. If it's just setting up a connection why mention table name? If I want to use 20 different tables in one db, do I need 20 separate CONNECT TO ... statements? I would hope not. So how about keeping CONNECT TO just for connections (I'll call them links here as that's Oracle's terminology for connections to remote databases): CONNECT TO $dsn AS mylinkname just sets up 'mylinkname' as an alias for the connection which can be used as SELECT ... FROM [EMAIL PROTECTED] and, if desired, also allow something like CREATE SYNONYM foo_table_name AS [EMAIL PROTECTED] (or some other syntax with the same effect) so SELECT ... FROM foo_table_name is same as saying SELECT ... FROM [EMAIL PROTECTED] ? [Note that CONNECT TO needs to handle username and password as well] It seems to me that making it a SQL statement allows for automating connections and processing from a file of SQL statements. Yes, I agree. I was wearing my implementation hat :) [FYI, I'm rather fond of the nested select syntax used by Oracle, mysql v4, and others: SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... I love that syntax too and it's quite powerful. I'm not sure I'm clever enough to be able to handle that with SQL::Parser yet though. since nested parens are an even worse bane than nested quotes. (Patches and co-authors welcome!) Perhaps it's time to use a parser tool like Parse::RecDecent. Then nested whatevers just wouldn't be a problem. You'd also get better syntax error messages etc. For calling temporary stored procedures, functions, modules --- my $rv = $dbh-do(# Run a procedure or function CALL $procedure($args) ); print $dbh-selectrow_array(# print results of a function or SELECT function($args)# procedure (no FROM or WHERE clause) ); I'm not really sure what you're saying here, for those two. Well the first is basically the way other stored procedures work isn't it, although maybe I should use BEGIN ... END instead. And the second is for getting the results of a function without bind_param jiggery-pokery. I borrowed it form Postgres -- it allows a truncated SELECT with no FROM clause as a way of calling functions. Sure, I see that. But how does that relate to SQL::Parser supporting multiple data sources etc that the rest of your message is about? Thank you! And thank *you* for providing such a elegant and extendable framework without which none of this would have been possible. Thanks, but I'm just bumbling my way along one step at a time... Tim.
Re: RFC: SQL Extensions for SQL::Statement [Long]
Tim Bunce wrote: I'd like to be able to refer to existing dbh's as well. I'd also like to be able to treat an existing $sth as a 'table'. Yep. SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ... Both are a little ugly and prone to nested quote problems. Ew, nested quote problems, the bane of my existence. I guess I was seeing these as shortcuts for simple DSNs (disallowing nested quotes) and figuring more complex DSNs would be put in as placeholders (although, is allowing the dsn to be a placeholder SELECT foo from [EMAIL PROTECTED] WHERE ... a violation of the basic idea that placeholders are for values, not structure ... it is, but an awfully handy one in this case). I think some form of naming would be handy: $dbh-{sql_sources} = { foo = 'dbi:Driver:dbname, # qualify with table name bar = $dbh,# qualify with table name baz = $sth,# implies select * from itself }; then: SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ... Well that hash will certainly exist and be addressable, but are you saying you don't like the CONNECT TO $dsn AS $table syntax? It seems to me that making it a SQL statement allows for automating connections and processing from a file of SQL statements. [FYI, I'm rather fond of the nested select syntax used by Oracle, mysql v4, and others: SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... I love that syntax too and it's quite powerful. I'm not sure I'm clever enough to be able to handle that with SQL::Parser yet though. since nested parens are an even worse bane than nested quotes. (Patches and co-authors welcome!) CREATE INTO TABLE is a new one on me. I think CREATE TABLE ... AS SELECT is more common. Are you specifically choosing non-standard syntax? Nope, I'll go with the standard whenever I can. I forget exactly why I chose that one :-). I'll go back and review my thinking. For calling temporary stored procedures, functions, modules --- my $rv = $dbh-do(# Run a procedure or function CALL $procedure($args) ); print $dbh-selectrow_array(# print results of a function or SELECT function($args)# procedure (no FROM or WHERE clause) ); I'm not really sure what you're saying here, for those two. Well the first is basically the way other stored procedures work isn't it, although maybe I should use BEGIN ... END instead. And the second is for getting the results of a function without bind_param jiggery-pokery. I borrowed it form Postgres -- it allows a truncated SELECT with no FROM clause as a way of calling functions. my $sth=$dbh-prepare( SELECT $cols FROM @? WHERE ..., {}, {format='AoA',source=$data_array_ref} ); $data_array_ref should be able to be an $sth. I guess I rejected that when I first tried it because it seemed confused (an sth as a parameter to an sth), but it could be a very easy way to get nested subqueries without the parsing headache since the two sth's would be parsed separately. Thank you! And thank *you* for providing such a elegant and extendable framework without which none of this would have been possible. -- Jeff
RFC: SQL Extensions for SQL::Statement [Long]
I will be releasing a significantly upgraded SQL::Statement and DBD::File shortly and I have some questions about interface. I'd really appreciate some feedback. These are the features that are near finalization: * heterogeneous SQL across multiple DBI sources * per-table DBI connections on a single dbh * improved API for DBD writers * support for in-memory tables ala DBD::RAM for all SQL::Statement DBDs * improved ability to specify SQL dialect features on a per-DBD, per-session, or per-statement basis * user-supplied functions, procedures, and persistent stored modules For heterogeneous DBI sources, all statements except joins are executed directly by the relevant DBD after a pre-parse by SQL::Statement so should be close to as fast as the DBD/rdbms itself. For joins, the physical fetches are handled by the underlying DBDs but the join and where_clause evaluation are handled by SQL::Statement so these will be slow as sin. The slowness applies only to joins, not to non-join multi-table subqueries which should be close to as fast as similar statements executed directly in the relevant DBDs. In terms of the interface to all of this, I am leaning toward SQL extensions and away from $dbh-func() calls and $dbh-{attr} settings but I'm open to all suggestions. This means support for the following kinds of constructs (though the [EMAIL PROTECTED] form may be too much): For control over SQL dialect features - CREATE/DROP TYPE CREATE/DROP KEYWORD CREATE/DROP FUNCTION For per-table connections - CONNECT TO '$dsn1' AS tbl1 CONNECT TO '$dsn2' AS tbl2 SELECT $cols FROM tbl1 ... JOIN tbl2 ... and/or SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ... and/or, for DSNs that can't be reduced to strings $dbh-prepare(q{ SELECT $cols FROM [EMAIL PROTECTED] ... JOIN [EMAIL PROTECTED] ... } ,{} ,$dsn1 ,$dsn2 ); For importing, exporting, converting heterogeneous sources -- convert data from one rdbms to another CREATE INTO TABLE x@$dsn1 ($new_col_defs) SELECT $oldcols FROM y@$dsn2 WHERE ... SELECT INTO x@$dsn1 ($newcols) $oldcols FROM y@$dsn2 WHERE ... import/export in-memory tables CREATE INTO TABLE [EMAIL PROTECTED] SELECT $cols FROM y@$dsn2 ... CREATE INTO TABLE x@$dsn2 SELECT $cols FROM [EMAIL PROTECTED] ... For calling temporary stored procedures, functions, modules --- my $rv = $dbh-do(# Run a procedure or function CALL $procedure($args) ); print $dbh-selectrow_array(# print results of a function or SELECT function($args)# procedure (no FROM or WHERE clause) ); $dbh-do(LOAD psmfile); # Load run SQL dialect commands, # or DDL, or procedures or # CONNECTION setings For printing out resultsets in formats -- print $dbh-selectrow_array( SELECT INTO @$format $cols FROM ... ); note: $format could be any format that is based on SQL::Statement, like CSV, AnyData::Fixed, AnyData::XML, etc. or it could be a coderef to a user-supplied routine that accepts a table in AoA and outputs a formatted string For directly querying strings, AoAs, and AoHs - my $sth=$dbh-prepare( SELECT $cols FROM @? WHERE ..., {}, {format='AoA',source=$data_array_ref} ); All of this infrstructure is really just an excuse to be able to write the JAPH in my sig. :-) Thanks for any comments. -- Jeff perl -MDBI -e 'print DBI-connect(dbi:AnyData:)-selectrow_array( SELECT DISTINCT col0 FROM @? ORDER BY col1,{},{format=AoA, source=[[H,4],[A,2],[J,1],[J,1],[P,3]]})'
Re: RFC: SQL Extensions for SQL::Statement [Long]
I've set my Reply-to to just [EMAIL PROTECTED] as I'm not fond of cross-posting between the two lists. So could people reply just to [EMAIL PROTECTED] please. Thanks. On Wed, Jun 04, 2003 at 09:15:28AM -0700, Jeff Zucker wrote: I will be releasing a significantly upgraded SQL::Statement and DBD::File shortly and I have some questions about interface. I'd really appreciate some feedback. These are the features that are near finalization: * heterogeneous SQL across multiple DBI sources * per-table DBI connections on a single dbh * improved API for DBD writers * support for in-memory tables ala DBD::RAM for all SQL::Statement DBDs * improved ability to specify SQL dialect features on a per-DBD, per-session, or per-statement basis * user-supplied functions, procedures, and persistent stored modules For heterogeneous DBI sources, all statements except joins are executed directly by the relevant DBD after a pre-parse by SQL::Statement so should be close to as fast as the DBD/rdbms itself. For joins, the physical fetches are handled by the underlying DBDs but the join and where_clause evaluation are handled by SQL::Statement so these will be slow as sin. The slowness applies only to joins, not to non-join multi-table subqueries which should be close to as fast as similar statements executed directly in the relevant DBDs. In terms of the interface to all of this, I am leaning toward SQL extensions and away from $dbh-func() calls and $dbh-{attr} settings but I'm open to all suggestions. (Note that func() isn't needed now the DBI has exposed install_method().) This means support for the following kinds of constructs (though the [EMAIL PROTECTED] form may be too much): For control over SQL dialect features - CREATE/DROP TYPE CREATE/DROP KEYWORD CREATE/DROP FUNCTION For per-table connections - CONNECT TO '$dsn1' AS tbl1 CONNECT TO '$dsn2' AS tbl2 SELECT $cols FROM tbl1 ... JOIN tbl2 ... I'd like to be able to refer to existing dbh's as well. I'd also like to be able to treat an existing $sth as a 'table'. and/or SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ... Both are a little ugly and prone to nested quote problems. I think some form of naming would be handy: $dbh-{sql_sources} = { foo = 'dbi:Driver:dbname, # qualify with table name bar = $dbh,# qualify with table name baz = $sth,# implies select * from itself }; then: SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ... and/or, for DSNs that can't be reduced to strings $dbh-prepare(q{ SELECT $cols FROM [EMAIL PROTECTED] ... JOIN [EMAIL PROTECTED] ... } ,{} ,$dsn1 ,$dsn2 ); [FYI, I'm rather fond of the nested select syntax used by Oracle, mysql v4, and others: SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... also, I recently saw that the SQL99 standard allows WITH RECURSIVE Q1 AS SELECT ... Q2 AS SELECT ... SELECT ... FROM Q1, Q2 WHERE ... which might be a useful style. But the [EMAIL PROTECTED] is fine.] For importing, exporting, converting heterogeneous sources -- convert data from one rdbms to another CREATE INTO TABLE x@$dsn1 ($new_col_defs) SELECT $oldcols FROM y@$dsn2 WHERE ... CREATE INTO TABLE is a new one on me. I think CREATE TABLE ... AS SELECT is more common. Are you specifically choosing non-standard syntax? SELECT INTO x@$dsn1 ($newcols) $oldcols FROM y@$dsn2 WHERE ... Same here, INSERT INTO ... SELECT is more common. For calling temporary stored procedures, functions, modules --- my $rv = $dbh-do(# Run a procedure or function CALL $procedure($args) ); print $dbh-selectrow_array(# print results of a function or SELECT function($args)# procedure (no FROM or WHERE clause) ); I'm not really sure what you're saying here, for those two. For printing out resultsets in formats -- print $dbh-selectrow_array( SELECT INTO @$format $cols FROM ... ); note: $format could be any format that is based on SQL::Statement, like CSV, AnyData::Fixed, AnyData::XML, etc. or it could be a coderef to a user-supplied routine that accepts a table in AoA and outputs a formatted string I think the nested select syntax may be more natural/obvious here: SELECT format_as_foo(table.*) FROM (SELECT ...) table but maybe that doesn't fit with with the [EMAIL PROTECTED] style