Re: Recursive CTE Support in Drill

2015-08-27 Thread Daniel Barclay
Ted Dunning wrote: The Cartesian join approach will produce an enormous stream of data with only a very small amount of disk read. You don't even need an external seed file (or to query a built-in table), and using WITH can extend multiplication to exponentiation: WITH q(key) AS ( WITH
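A hedged sketch of the exponentiation idea (table and column names are hypothetical, and whether Drill accepts a bare VALUES list like this is an assumption): each self-join multiplies the row count, so nesting the trick squares it at every level.

```sql
-- Hypothetical sketch: 10 seed rows -> 100 -> 10,000 without reading disk.
WITH ten(k) AS (
  SELECT v FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS t(v)
),
hundred(k) AS (
  SELECT a.k FROM ten a, ten b        -- 10 * 10 = 100 rows
)
SELECT h1.k
FROM hundred h1, hundred h2;          -- 100 * 100 = 10,000 rows
```

Adding one more WITH level would raise the count to 100,000,000 rows, which is the multiplication-to-exponentiation point being made.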

Re: Recursive CTE Support in Drill

2015-07-20 Thread Alexander Zarei
Thanks for the further elaboration Ted, Jacques and Jason! @Ted that is a very cool idea. I tried the cross join but found that cross join is not supported in Drill yet; we have DRILL-786 tracking it. The new method looks very promising. It seems it is an implicit cross join, isn't it? I just tried it out

Re: Recursive CTE Support in Drill

2015-07-18 Thread Jacques Nadeau
Good point. In fact, you can just use a literal expression and some sample data, such as the tpch lineitem: SELECT * FROM (select l_orderkey, l_shipdate, l_commitdate, l_shipmode, 1 as join_key from cp.`tpch/lineitem.parquet`) t1 JOIN (select l_orderkey, l_shipdate, l_commitdate, l_shipmode, 1 as
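A sketch of the pattern Jacques describes (assuming the cp.`tpch/lineitem.parquet` sample that ships with Drill; the projected columns here are trimmed for brevity): tagging both sides with the same literal key makes the equi-join degenerate into a Cartesian product, so N input rows become N² output rows.

```sql
-- Constant join key: every t1 row matches every t2 row (N * N results).
SELECT t1.l_orderkey, t2.l_shipdate
FROM (SELECT l_orderkey, 1 AS join_key
      FROM cp.`tpch/lineitem.parquet`) t1
JOIN (SELECT l_shipdate, 1 AS join_key
      FROM cp.`tpch/lineitem.parquet`) t2
  ON t1.join_key = t2.join_key;
```

This sidesteps the missing CROSS JOIN support (DRILL-786) because the planner sees an ordinary equi-join.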

Re: Recursive CTE Support in Drill

2015-07-16 Thread Ted Dunning
Also, just doing a Cartesian join of three copies of 1000 records will give you a billion records with negligible I/O. Sent from my iPhone On Jul 16, 2015, at 15:43, Jason Altekruse altekruseja...@gmail.com wrote: @Alexander If you want to test the speed of the ODBC driver you can do that
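The arithmetic behind Ted's claim is 1000 × 1000 × 1000 = 10⁹. A hedged sketch with a hypothetical 1000-row table `seed` (if Drill rejects the bare comma-join, the constant-join-key rewrite from elsewhere in the thread applies):

```sql
-- Hypothetical: seed is any 1000-row table. The three-way Cartesian join
-- emits 1,000,000,000 rows while reading only ~3000 rows from disk.
SELECT a.id
FROM seed a, seed b, seed c;
```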

Re: Recursive CTE Support in Drill

2015-07-16 Thread Alexander Zarei
Thanks for the answers. @Ted my only goal is to pump a large amount of data without having to read from the hard disk. I am measuring the ODBC driver's performance and I need a higher data transfer rate, so any method that helps pump data out of Drill faster would help. The log-synth seems a good

Re: Recursive CTE Support in Drill

2015-07-16 Thread Jason Altekruse
@Alexander If you want to test the speed of the ODBC driver you can do that without a new storage plugin. If you get the entire dataset into memory, it will be returned from Drill as quickly as we can possibly send it to the client. One way to do this is to insert a sort; we cannot send along any
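Jason's trick as a sketch (the table and sort column are illustrative): an ORDER BY forces Drill to buffer the entire result before the first row is returned, so the client-side measurement runs against in-memory data rather than disk-bound scans.

```sql
-- The sort is a blocking operator: the whole dataset is materialized
-- in memory before any rows are streamed to the ODBC client.
SELECT *
FROM cp.`tpch/lineitem.parquet`
ORDER BY l_orderkey;
```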

Re: Recursive CTE Support in Drill

2015-07-10 Thread Abdel Hakim Deneche
Yeah, we still lack documentation on how to write a storage plugin. One piece of advice I've seen a lot is to take a look at the mongo-db plugin; it was basically added in one single commit: https://github.com/apache/drill/commit/2ca9c907bff639e08a561eac32e0acab3a0b3304 I think this will give some

Re: Recursive CTE Support in Drill

2015-07-10 Thread Ted Dunning
I don't think we need a full-on storage plugin. I think a data format should be sufficient, basically CSV on steroids. On Fri, Jul 10, 2015 at 10:47 AM, Abdel Hakim Deneche adene...@maprtech.com wrote: Yeah, we still lack documentation on how to write a storage plugin. One advice I've

Re: Recursive CTE Support in Drill

2015-07-10 Thread Ted Dunning
Hakim, Not yet. Still very much in the stage of gathering feedback. I would think it very simple. The biggest obstacles are 1) no documentation on how to write a data format 2) I need to release a jar for log-synth to Maven Central. On Fri, Jul 10, 2015 at 8:17 AM, Abdel Hakim Deneche

Re: Recursive CTE Support in Drill

2015-07-10 Thread Ted Dunning
It may be easy, but it is completely opaque about what really needs to happen. For instance, 1) how is schema exposed? 2) which classes do I really need to implement? 3) how do I express partitioning of a format? 4) how do I test it? Just a bit of documentation and comments would go a very,

Fwd: Recursive CTE Support in Drill

2015-07-09 Thread Alexander Zarei
+ user@drill Hi All, I am trying to come up with a query which returns a given number of rows without having a real table in storage. I am hoping to achieve something like this: http://stackoverflow.com/questions/6533524/sql-select-n-records-without-a-table DECLARE @start INT = 1;DECLARE @end
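For context, the linked Stack Overflow approach looks roughly like the following T-SQL sketch (SQL Server syntax; Drill does not support recursive CTEs, which is what this thread is about, and the variable names follow the truncated snippet above):

```sql
-- T-SQL style row generation via a recursive CTE.
DECLARE @start INT = 1;
DECLARE @end INT = 10;
WITH numbers AS (
  SELECT @start AS n
  UNION ALL
  SELECT n + 1 FROM numbers WHERE n < @end   -- recurse until @end
)
SELECT n FROM numbers
OPTION (MAXRECURSION 0);                     -- lift the default recursion cap
```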

Re: Recursive CTE Support in Drill

2015-07-09 Thread Ted Dunning
Are you hard set on using common table expressions? I have discussed a bit off-list creating a data format that would allow tables to be read from a log-synth [1] schema. That would let you read as much data as you might like with an arbitrarily complex (or simple) query. Operationally, you