It may be easy, but it is completely opaque about what really needs to happen.
For instance,

1) how is schema exposed?

2) which classes do I really need to implement?

3) how do I express partitioning of a format?

4) how do I test it?

Just a bit of documentation and comments would go a very, very long way. Even answers on the mailing list that have more details than "oh, that's easy" would help. I would be happy to transcribe answers into the code if I could just get some.

On Fri, Jul 10, 2015 at 11:04 AM, Jacques Nadeau <jacq...@apache.org> wrote:

> Creating an EasyFormatPlugin is pretty simple. They were designed to get
> rid of much of the scaffolding required for a standard FormatPlugin.
>
> JSON
>
> https://github.com/apache/drill/tree/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json
>
> Text
>
> https://github.com/apache/drill/tree/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text
>
> AVRO
>
> https://github.com/apache/drill/tree/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/avro
>
> In all cases, the connection code is pretty light. A fully schematized
> format like log-synth should be even simpler to implement.
>
> On Fri, Jul 10, 2015 at 10:58 AM, Ted Dunning <ted.dunn...@gmail.com>
> wrote:
>
> > I don't think we need a full on storage plugin. I think a data format
> > should be sufficient, basically CSV on steroids.
> >
> > On Fri, Jul 10, 2015 at 10:47 AM, Abdel Hakim Deneche <adene...@maprtech.com>
> > wrote:
> >
> > > Yeah, we still lack documentation on how to write a storage plugin. One
> > > advice I've been seeing a lot is to take a look at the mongo-db plugin, it
> > > was basically added in one single commit:
> > >
> > > https://github.com/apache/drill/commit/2ca9c907bff639e08a561eac32e0acab3a0b3304
> > >
> > > I think this will give some general ideas on what to expect when writing a
> > > storage plugin.
> > >
> > > On Fri, Jul 10, 2015 at 9:10 AM, Ted Dunning <ted.dunn...@gmail.com>
> > > wrote:
> > >
> > > > Hakim,
> > > >
> > > > Not yet. Still very much in the stage of gathering feedback.
> > > >
> > > > I would think it very simple. The biggest obstacles are
> > > >
> > > > 1) no documentation on how to write a data format
> > > >
> > > > 2) I need to release a jar for log-synth to Maven Central.
> > > >
> > > > On Fri, Jul 10, 2015 at 8:17 AM, Abdel Hakim Deneche <adene...@maprtech.com>
> > > > wrote:
> > > >
> > > > > @Ted, the log-synth storage format would be really useful. I'm already
> > > > > seeing many unit tests that could benefit from this. Do you have a github
> > > > > repo for your ongoing work?
> > > > >
> > > > > Thanks!
> > > > >
> > > > > On Thu, Jul 9, 2015 at 10:56 PM, Ted Dunning <ted.dunn...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Are you hard set on using common table expressions?
> > > > > >
> > > > > > I have discussed a bit off-list creating a data format that would allow
> > > > > > tables to be read from a log-synth [1] schema. That would let you read as
> > > > > > much data as you might like with an arbitrarily complex (or simple) query.
> > > > > >
> > > > > > Operationally, you would create a file containing a log-synth schema that
> > > > > > has the extension .synth. Your data source would have to be configured to
> > > > > > connect that extension with the log-synth format. At that point, you could
> > > > > > select as much or little data as you like from the file and you would see
> > > > > > generated data rather than the schema.
> > > > > >
> > > > > > [1] https://github.com/tdunning/log-synth
> > > > > >
> > > > > > On Thu, Jul 9, 2015 at 11:31 AM, Alexander Zarei <alexanderz.si...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > I am trying to come up with a query which returns a given number of rows
> > > > > > > without having a real table on Storage.
> > > > > > >
> > > > > > > I am hoping to achieve something like this:
> > > > > > >
> > > > > > > http://stackoverflow.com/questions/6533524/sql-select-n-records-without-a-table
> > > > > > >
> > > > > > > DECLARE @start INT = 1;
> > > > > > > DECLARE @end INT = 1000000;
> > > > > > > WITH numbers AS (
> > > > > > >     SELECT @start AS number
> > > > > > >     UNION ALL
> > > > > > >     SELECT number + 1
> > > > > > >     FROM numbers
> > > > > > >     WHERE number < @end
> > > > > > > )
> > > > > > > SELECT *
> > > > > > > FROM numbers
> > > > > > > OPTION (MAXRECURSION 0);
> > > > > > >
> > > > > > > I do not actually need to create different values and returning identical
> > > > > > > rows would work too. I just need to bypass the "from clause" in the query.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Alex
> > > > >
> > > > > --
> > > > > Abdelhakim Deneche
> > > > > Software Engineer
> > > > > <http://www.mapr.com/>
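
For anyone who wants to try the recursive-CTE idea from Alex's original question without a SQL Server instance, here is a minimal sketch under stated assumptions. It runs against an in-memory SQLite database through Python's standard sqlite3 module, not against Drill, and the function name generate_rows is made up for this example. It only illustrates "N rows without a real table"; it says nothing about whether Drill accepts the same syntax.

```python
import sqlite3


def generate_rows(start, end):
    """Return the integers start..end using a recursive CTE and no real table.

    Hypothetical helper for illustration only; it targets SQLite, not Drill.
    """
    # An in-memory database: no tables are ever created in it.
    conn = sqlite3.connect(":memory:")
    try:
        query = """
            WITH RECURSIVE numbers(number) AS (
                SELECT :start
                UNION ALL
                SELECT number + 1 FROM numbers WHERE number < :end
            )
            SELECT number FROM numbers
        """
        cursor = conn.execute(query, {"start": start, "end": end})
        return [row[0] for row in cursor]
    finally:
        conn.close()


if __name__ == "__main__":
    rows = generate_rows(1, 1000)
    print(len(rows), rows[0], rows[-1])
```

As in the T-SQL original, the anchor row is always emitted, so a start value greater than end still yields one row.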