Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-06 Thread Bart Lateur
On Thu, 5 Jun 2003 10:50:45 +0100, Tim Bunce wrote:

[quoting from the root message:]

  SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ...

I think we're very much in the realms of inventing our own syntax here.
We're just using 'standard sql' as a guide for everyones sanity.

FWIW, MySQL allows the syntax

SELECT * FROM database.table

where table is the name of a table in the database with name
database. I'm not sure how standard that is :)

Both databases, the one you connect to, and the one mentioned here,
must (?) reside on the same MySQL server, though. It's the only way I
expect it to behave sanely, anyway.

-- 
Bart.


Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-06 Thread Bart Lateur
On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote:

  For per-table connections
  -
  CONNECT TO '$dsn1' AS tbl1
  CONNECT TO '$dsn2' AS tbl2

Euh... you connect to a database, not to a table, no?

-- 
Bart.


Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-06 Thread johnnnnnn
On Thu, Jun 05, 2003 at 10:32:58PM +0200, Bart Lateur wrote:
 FWIW, MySQL allows the syntax
 
   SELECT * FROM database.table
 
 where table is the name of a table in the database with name
 database. I'm not sure how standard that is :)

Much more standard is:

  SELECT * FROM schema.table

MySQL doesn't have schemas, though, and their internal hierarchy makes
that a reasonable substitute.

-joh



Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-06 Thread Jeff Zucker
Bart Lateur wrote:

On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote:

 

For per-table connections
-
CONNECT TO '$dsn1' AS tbl1
CONNECT TO '$dsn2' AS tbl2
   

Euh... you connect to a database, not to a table, no?

Well, yes and no.  The point of these, as distinct from a $dbh 
connection is that they are per-table connections.  One will use a 
single AnyData $dbh to connect to multiple tables each of which may 
reside in a different database, in a different rdbms, in a different 
location.  So I feel it is important to somehow emphasize that these are 
not the same kind of connection as we usually talk about in DBI.  Well, 
ok, under the hood they are going to be $dbhs and you will be able to 
address them, but for the purposes of the SQL syntax they behave 
differently.

Still, I bow to the collective wisdom and will follow Tim's (and yours 
and Dean's) advice to use CONNECT TO to refer to connections.  Gosh you 
guys are picky.  You probably won't like my new SELECT syntax that does 
a delete either.  :-)

FWIW, MySQL allows the syntax
SELECT * FROM database.table
Well ODBC has something similar with catalog and schema thrown in with 
the possibility that they are not all in the same rdbms or same 
location.  Oracle uses the [EMAIL PROTECTED] format where link is a 
heterogeneous database. 

Both databases, the one you connect to, and the one mentioned here,
must (?) reside on the same MySQL server, though. It's the only way I
expect it to behave sanely, anyway.
Bart, you should know by now that sanity is not one of my core values :-).  But you're right, in most cases something like this (which will be quite ordinary in DBD::AnyData) looks looney:

 SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] WHERE ...

--
Jeff







Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-06 Thread Dean Arnold
FWIW: years ago, an early SQL std. for distributed queries used
dotted notation for everything (e.g., 
somehost.somecatalog.someschema.sometable.somecolumn).
I don't know if thats still a part of the std or not...but having fiddled around
with a similar concept, I concur with the '[EMAIL PROTECTED]'
as a better, more obvious (and likely easier to safely parse) syntax.

One potential gotcha to lookout for: SQL Server (MSFT and Sybase) use '@'
in valid names (either system objects or procedures, can't recall which)...
so you may need to be cautious with the parsing...I think I've only
ever seen that at the beginning of an object name, so you should be
able to trap it.

Dean Arnold
Presicient Corp.
www.presicient.com

- Original Message - 
From: Jeff Zucker [EMAIL PROTECTED]
Cc: dbi-users [EMAIL PROTECTED]
Sent: Thursday, June 05, 2003 1:57 PM
Subject: Re: RFC: SQL Extensions for SQL::Statement [Long]


 Bart Lateur wrote:

 On Wed, 04 Jun 2003 09:15:28 -0700, Jeff Zucker wrote:
 
 
 
  For per-table connections
  -
  CONNECT TO '$dsn1' AS tbl1
  CONNECT TO '$dsn2' AS tbl2
 
 
 
 Euh... you connect to a database, not to a table, no?
 

 Well, yes and no.  The point of these, as distinct from a $dbh
 connection is that they are per-table connections.  One will use a
 single AnyData $dbh to connect to multiple tables each of which may
 reside in a different database, in a different rdbms, in a different
 location.  So I feel it is important to somehow emphasize that these are
 not the same kind of connection as we usually talk about in DBI.  Well,
 ok, under the hood they are going to be $dbhs and you will be able to
 address them, but for the purposes of the SQL syntax they behave
 differently.

 Still, I bow to the collective wisdom and will follow Tim's (and yours
 and Dean's) advice to use CONNECT TO to refer to connections.  Gosh you
 guys are picky.  You probably won't like my new SELECT syntax that does
 a delete either.  :-)

  FWIW, MySQL allows the syntax
  SELECT * FROM database.table

 Well ODBC has something similar with catalog and schema thrown in with
 the possibility that they are not all in the same rdbms or same
 location.  Oracle uses the [EMAIL PROTECTED] format where link is a
 heterogeneous database.

  Both databases, the one you connect to, and the one mentioned here,
  must (?) reside on the same MySQL server, though. It's the only way I
  expect it to behave sanely, anyway.

 Bart, you should know by now that sanity is not one of my core values :-).  But 
 you're right, in
most cases something like this (which will be quite ordinary in DBD::AnyData) looks 
looney:

   SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] WHERE ...


 -- 
 Jeff








Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-05 Thread Tim Bunce
On Wed, Jun 04, 2003 at 12:22:19PM -0700, Jeff Zucker wrote:
 Tim Bunce wrote:
 
  SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ...
 
 Both are a little ugly and prone to nested quote problems.

 Ew, nested quote problems, the bane of my existence.  I guess I was 
 seeing these as shortcuts for simple DSNs (disallowing nested quotes) 
 and figuring more complex DSNs would be put in as placeholders 
 (although, is allowing the dsn to be a placeholder SELECT foo from 
 [EMAIL PROTECTED] WHERE  ... a violation of the basic idea that placeholders are 
 for values, not structure ... it is, but an awfully handy one in this case).

I think we're very much in the realms of inventing our own syntax here.
We're just using 'standard sql' as a guide for everyones sanity.
For that perspective, using placeholders as you propose is just fine :)

 I think some form of naming would be handy:
 
  $dbh-{sql_sources} = {
  foo = 'dbi:Driver:dbname, # qualify with table name
  bar = $dbh,# qualify with table name
  baz = $sth,# implies select * from itself
  };
 then:
   SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ...

 Well that hash will certainly exist and be addressable, but are you 
 saying you don't like the  CONNECT TO $dsn AS $table syntax?

No. I'm just saying I dislike SELECT $cols FROM tbl1@$dsn1
Perhaps allow it but don't use it in examples :)

Also, CONNECT TO $dsn AS $table seems rather confused between
'levels'. If it's just setting up a connection why mention table name?
If I want to use 20 different tables in one db, do I need 20 separate
CONNECT TO ... statements?  I would hope not.

So how about keeping CONNECT TO just for connections (I'll call
them links here as that's Oracle's terminology for connections to
remote databases):

CONNECT TO $dsn AS mylinkname

just sets up 'mylinkname' as an alias for the connection which can be used as

SELECT ... FROM [EMAIL PROTECTED]

and, if desired, also allow something like

CREATE SYNONYM foo_table_name AS [EMAIL PROTECTED]

(or some other syntax with the same effect) so

SELECT ... FROM foo_table_name

is same as saying

SELECT ... FROM [EMAIL PROTECTED]

?

[Note that CONNECT TO needs to handle username and password as well]

 It seems to me that making it a SQL statement allows for automating
 connections and processing from a file of SQL statements.

Yes, I agree. I was wearing my implementation hat :)

 [FYI, I'm rather fond of the nested select syntax used by Oracle,
 mysql v4, and others:
 
SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... 

 I love that syntax too and it's quite powerful.  I'm not sure I'm clever 
 enough to be able to handle that with SQL::Parser yet though. since 
 nested parens are an even worse bane than nested quotes.  (Patches and 
 co-authors welcome!)

Perhaps it's time to use a parser tool like Parse::RecDecent. Then
nested whatevers just wouldn't be a problem.  You'd also get better
syntax error messages etc.


 For calling temporary  stored procedures, functions,  modules
 ---
 
 my $rv = $dbh-do(# Run a procedure or function
 CALL $procedure($args)
 );
 print $dbh-selectrow_array(# print results of a function or
 SELECT function($args)# procedure (no FROM or WHERE clause)
 );
 
 I'm not really sure what you're saying here, for those two.
 
 Well the first is basically the way other stored procedures work isn't 
 it, although maybe I should  use BEGIN ... END instead.  And the second 
 is for getting the results of a function without bind_param 
 jiggery-pokery.  I borrowed it form Postgres  -- it allows a truncated 
 SELECT with no FROM clause as a way of calling functions.

Sure, I see that. But how does that relate to SQL::Parser supporting
multiple data sources etc that the rest of your message is about?


 Thank you!

 And thank *you* for providing such a elegant and extendable framework 
 without which none of this would have been possible.

Thanks, but I'm just bumbling my way along one step at a time...

Tim.


Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-05 Thread Jeff Zucker
Tim Bunce wrote:

I'd like to be able to refer to existing dbh's as well.
I'd also like to be able to treat an existing $sth as a 'table'.
Yep.

 SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ...
   

Both are a little ugly and prone to nested quote problems.

Ew, nested quote problems, the bane of my existence.  I guess I was 
seeing these as shortcuts for simple DSNs (disallowing nested quotes) 
and figuring more complex DSNs would be put in as placeholders 
(although, is allowing the dsn to be a placeholder SELECT foo from 
[EMAIL PROTECTED] WHERE  ... a violation of the basic idea that placeholders are 
for values, not structure ... it is, but an awfully handy one in this case).

I think some form of naming would be handy:

$dbh-{sql_sources} = {
foo = 'dbi:Driver:dbname, # qualify with table name
bar = $dbh,# qualify with table name
baz = $sth,# implies select * from itself
};
then:
  SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ...
Well that hash will certainly exist and be addressable, but are you 
saying you don't like the  CONNECT TO $dsn AS $table syntax?  It seems 
to me that making it a SQL statement allows for automating connections 
and processing from a file of SQL statements.

[FYI, I'm rather fond of the nested select syntax used by Oracle,
mysql v4, and others:
   SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... 

I love that syntax too and it's quite powerful.  I'm not sure I'm clever 
enough to be able to handle that with SQL::Parser yet though. since 
nested parens are an even worse bane than nested quotes.  (Patches and 
co-authors welcome!)

CREATE INTO TABLE is a new one on me. I think CREATE TABLE ... AS SELECT
is more common. Are you specifically choosing non-standard syntax?
Nope, I'll go with the standard whenever I can.  I forget exactly why I 
chose that one :-).  I'll go back and review my thinking.

For calling temporary  stored procedures, functions,  modules
---
my $rv = $dbh-do(# Run a procedure or function
CALL $procedure($args)
);
print $dbh-selectrow_array(# print results of a function or
SELECT function($args)# procedure (no FROM or WHERE clause)
);
I'm not really sure what you're saying here, for those two.

Well the first is basically the way other stored procedures work isn't 
it, although maybe I should  use BEGIN ... END instead.  And the second 
is for getting the results of a function without bind_param 
jiggery-pokery.  I borrowed it form Postgres  -- it allows a truncated 
SELECT with no FROM clause as a way of calling functions.

my $sth=$dbh-prepare(
SELECT $cols FROM @? WHERE ...,
{},
{format='AoA',source=$data_array_ref}
);
$data_array_ref
should be able to be an $sth.
I guess I rejected that when I first tried it because it seemed confused 
(an sth as a parameter to an sth), but it could be a very easy way to 
get nested subqueries without the parsing headache since the two sth's 
would be parsed separately.

Thank you!
 

And thank *you* for providing such a elegant and extendable framework 
without which none of this would have been possible.

--
Jeff


RFC: SQL Extensions for SQL::Statement [Long]

2003-06-05 Thread Jeff Zucker
I will be releasing a significantly upgraded SQL::Statement and 
DBD::File shortly and I have some questions about interface.  I'd really 
appreciate some feedback.  These are the features that are near 
finalization:

 * heterogeneous SQL across multiple DBI sources
 * per-table DBI connections on a single dbh
 * improved API for DBD writers
 * support for in-memory tables ala DBD::RAM for all SQL::Statement DBDs
 * improved ability to specify SQL dialect features on a per-DBD,
   per-session, or per-statement basis
 * user-supplied functions, procedures, and persistent stored modules
For heterogeneous DBI sources, all statements except joins are executed 
directly by the relevant DBD after a pre-parse by SQL::Statement so 
should be close to as fast as the DBD/rdbms itself.  For joins, the 
physical fetches are handled by the underlying DBDs but the join and 
where_clause evaluation are handled by SQL::Statement so these will be 
slow as sin.  The slowness applies only to joins, not to non-join 
multi-table subqueries which should be close to as fast as similar 
statements executed directly in the relevant DBDs.

In terms of the interface to all of this, I am leaning toward SQL 
extensions and away from $dbh-func() calls and $dbh-{attr} settings 
but I'm open to all suggestions.  This means support for the following 
kinds of constructs (though the [EMAIL PROTECTED] form may be too much):

 For control over SQL dialect features
 -
 CREATE/DROP TYPE
 CREATE/DROP KEYWORD
 CREATE/DROP FUNCTION
 For per-table connections
 -
 CONNECT TO '$dsn1' AS tbl1
 CONNECT TO '$dsn2' AS tbl2
 SELECT $cols FROM tbl1 ... JOIN tbl2 ...
   and/or

 SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ...

   and/or, for DSNs that can't be reduced to strings

 $dbh-prepare(q{ SELECT $cols FROM [EMAIL PROTECTED] ... JOIN [EMAIL PROTECTED] 
... }
  ,{}
  ,$dsn1
  ,$dsn2
  );
 For importing, exporting, converting heterogeneous sources
 --
 convert data from one rdbms to another
CREATE INTO TABLE x@$dsn1 ($new_col_defs)
  SELECT $oldcols FROM y@$dsn2 WHERE ...
SELECT INTO x@$dsn1 ($newcols)
$oldcols FROM y@$dsn2 WHERE ...
 import/export in-memory tables

CREATE INTO TABLE [EMAIL PROTECTED] SELECT $cols FROM y@$dsn2 ...

CREATE INTO TABLE x@$dsn2 SELECT $cols FROM [EMAIL PROTECTED] ...

 For calling temporary  stored procedures, functions,  modules
 ---
 my $rv = $dbh-do(# Run a procedure or function
 CALL $procedure($args)
 );
 print $dbh-selectrow_array(# print results of a function or
 SELECT function($args)# procedure (no FROM or WHERE clause)
 );
 
 $dbh-do(LOAD psmfile); # Load  run SQL dialect commands,
   # or DDL, or procedures or
   # CONNECTION setings

 For printing out resultsets in formats
 --
 print $dbh-selectrow_array(
 SELECT INTO @$format $cols FROM ...
 );
 note: $format could be any format that is based on SQL::Statement, 
like
   CSV, AnyData::Fixed, AnyData::XML, etc. or it could be a coderef
   to a user-supplied routine that accepts a table in AoA and
   outputs a  formatted string

 For directly querying strings, AoAs, and AoHs
 -
 my $sth=$dbh-prepare(
 SELECT $cols FROM @? WHERE ...,
 {},
 {format='AoA',source=$data_array_ref}
 );
All of this infrstructure is really just an excuse to be able to write 
the JAPH in my sig. :-)

Thanks for any comments.

--
Jeff
perl -MDBI -e 'print DBI-connect(dbi:AnyData:)-selectrow_array(
SELECT DISTINCT col0 FROM @? ORDER BY col1,{},{format=AoA,
source=[[H,4],[A,2],[J,1],[J,1],[P,3]]})'





Re: RFC: SQL Extensions for SQL::Statement [Long]

2003-06-05 Thread Tim Bunce
I've set my Reply-to to just [EMAIL PROTECTED] as I'm not fond of
cross-posting between the two lists. So could people reply just to
[EMAIL PROTECTED] please. Thanks.

On Wed, Jun 04, 2003 at 09:15:28AM -0700, Jeff Zucker wrote:
 I will be releasing a significantly upgraded SQL::Statement and 
 DBD::File shortly and I have some questions about interface.  I'd really 
 appreciate some feedback.  These are the features that are near 
 finalization:
 
  * heterogeneous SQL across multiple DBI sources
  * per-table DBI connections on a single dbh
  * improved API for DBD writers
  * support for in-memory tables ala DBD::RAM for all SQL::Statement DBDs
  * improved ability to specify SQL dialect features on a per-DBD,
per-session, or per-statement basis
  * user-supplied functions, procedures, and persistent stored modules
 
 For heterogeneous DBI sources, all statements except joins are executed 
 directly by the relevant DBD after a pre-parse by SQL::Statement so 
 should be close to as fast as the DBD/rdbms itself.  For joins, the 
 physical fetches are handled by the underlying DBDs but the join and 
 where_clause evaluation are handled by SQL::Statement so these will be 
 slow as sin.  The slowness applies only to joins, not to non-join 
 multi-table subqueries which should be close to as fast as similar 
 statements executed directly in the relevant DBDs.
 
 In terms of the interface to all of this, I am leaning toward SQL 
 extensions and away from $dbh-func() calls and $dbh-{attr} settings 
 but I'm open to all suggestions.

(Note that func() isn't needed now the DBI has exposed install_method().)

 This means support for the following 
 kinds of constructs (though the [EMAIL PROTECTED] form may be too much):
 
  For control over SQL dialect features
  -
  CREATE/DROP TYPE
  CREATE/DROP KEYWORD
  CREATE/DROP FUNCTION
 
  For per-table connections
  -
  CONNECT TO '$dsn1' AS tbl1
  CONNECT TO '$dsn2' AS tbl2
  SELECT $cols FROM tbl1 ... JOIN tbl2 ...

I'd like to be able to refer to existing dbh's as well.
I'd also like to be able to treat an existing $sth as a 'table'.

and/or
 
  SELECT $cols FROM tbl1@$dsn1 ... JOIN tbl2@$dsn2 ...

Both are a little ugly and prone to nested quote problems.
I think some form of naming would be handy:

$dbh-{sql_sources} = {
foo = 'dbi:Driver:dbname, # qualify with table name
bar = $dbh,# qualify with table name
baz = $sth,# implies select * from itself
};
then:
   SELECT $cols FROM [EMAIL PROTECTED] JOIN [EMAIL PROTECTED] ...

and/or, for DSNs that can't be reduced to strings
 
  $dbh-prepare(q{ SELECT $cols FROM [EMAIL PROTECTED] ... JOIN [EMAIL PROTECTED] 
 ... }
   ,{}
   ,$dsn1
   ,$dsn2
   );

[FYI, I'm rather fond of the nested select syntax used by Oracle,
mysql v4, and others:

SELECT foo, bar FROM (SELECT ...) table_alias WHERE ... 

also, I recently saw that the SQL99 standard allows

WITH RECURSIVE
Q1 AS SELECT ...
Q2 AS SELECT ...
SELECT ... FROM Q1, Q2 WHERE ...

which might be a useful style. But the [EMAIL PROTECTED] is fine.]



  For importing, exporting, converting heterogeneous sources
  --
  convert data from one rdbms to another
 
 CREATE INTO TABLE x@$dsn1 ($new_col_defs)
   SELECT $oldcols FROM y@$dsn2 WHERE ...

CREATE INTO TABLE is a new one on me. I think CREATE TABLE ... AS SELECT
is more common. Are you specifically choosing non-standard syntax?

 SELECT INTO x@$dsn1 ($newcols)
 $oldcols FROM y@$dsn2 WHERE ...

Same here, INSERT INTO ... SELECT is more common.

  For calling temporary  stored procedures, functions,  modules
  ---
 
  my $rv = $dbh-do(# Run a procedure or function
  CALL $procedure($args)
  );
  print $dbh-selectrow_array(# print results of a function or
  SELECT function($args)# procedure (no FROM or WHERE clause)
  );

I'm not really sure what you're saying here, for those two.


  For printing out resultsets in formats
  --
 
  print $dbh-selectrow_array(
  SELECT INTO @$format $cols FROM ...
  );
 
  note: $format could be any format that is based on SQL::Statement, 
 like
CSV, AnyData::Fixed, AnyData::XML, etc. or it could be a coderef
to a user-supplied routine that accepts a table in AoA and
outputs a  formatted string

I think the nested select syntax may be more natural/obvious here:

SELECT format_as_foo(table.*) FROM (SELECT ...) table

but maybe that doesn't fit with with the [EMAIL PROTECTED] style