Re: Dynamic Tables in Calcite

Ravikumar CS Thu, 18 Aug 2016 16:58:25 -0700

Hi,

I saw that Drill is able to bypass SQL validation of table names & run
queries on arbitrary paths (as tables).
This is exactly what I need. Could someone from Drill comment on
this?


*Example:* select * from dfs.some_path_to_json_file.json;

~Ravi

On Wed, Aug 17, 2016 at 8:42 AM, Ravikumar CS <[email protected]>
wrote:

> Thanks Julian for your feedback. I will put together some code & tests.
>
> I created a new CsvDynamicSchemaFactory which returns a
> CsvDynamicSchema (based on the CSV example).
> CsvDynamicSchema holds a map containing the tableName -> Table mapping.
> The overridden getTable(String name) creates each table the first time it
> is requested & caches it in the map. However, the queries fail during
> validation, at getTableNames().
>
> Is there a configuration setting that we can use so that
> getTableNames() is not called? All
> these table names are arbitrary file paths and they are not known
> beforehand, at least not until the first time they
> are encountered; after that they are in the map.
>
> ~Ravi
>
>
> On Wed, Aug 17, 2016 at 2:10 AM, Julian Hyde <[email protected]> wrote:
>
>> Yes, Csv tables are a good place to start. Not sure whether we want to
>> accept extensions into the example/csv module, because it is intended as an
>> example. But do what you need to, create tests, create a pull request, and
>> let’s see what we can do with it. Maybe it could be combined into
>> https://issues.apache.org/jira/browse/CALCITE-884, if that work ever
>> gets finished.
>>
>> CsvSchema.createTable creates a File object (based on the table’s name)
>> which is then stored in a CsvTable. I think that’s a good model for you to
>> follow, whether or not you build upon the Csv adapter. I was mistaken in
>> saying that you could share a single Table object among multiple files.
>> There is simply not enough context passed into the
>> "ScannableTable.scan(DataContext)" method for a table to do the right
>> thing if the table does not know what file it is reading.
>>
>> Julian
>>
>>
>> [1] https://en.wikipedia.org/wiki/Inode
>>
>> > On Aug 16, 2016, at 12:57 AM, Ravikumar CS <[email protected]>
>> > wrote:
>> >
>> > Hi Julian,
>> >
>> >   Thanks for your reply.
>> >
>> >   Could you elaborate a bit on what you mean by "When that Table
>> >   object is used (e.g. when it is wrapped in a RelOptTableImpl),
>> >   Calcite will supply the path."?
>> >
>> >   I am currently looking at the CSVTable schema [1] implementation to
>> >   see how it could be enhanced to dynamically accept CSV files. Will
>> >   this logic reside in the SchemaFactory, and how should
>> >   RelOptTableImpl be used?
>> >
>> >   If this is useful to a larger audience, I am happy to provide
>> >   documentation.
>> >
>> > ~Ravi
>> >
>> > [1] Calcite Schema:
>> >    Schema:
>> > https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchema.java
>> >    SchemaFactory:
>> > https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory.java
>> >
>> > On Sat, Aug 13, 2016 at 12:21 AM, Ravikumar CS <[email protected]>
>> > wrote:
>> >
>> >> Thanks Julian. Is there an example that I can look at?
>> >>
>> >> ~Ravi
>> >>
>> >> On Fri, Aug 12, 2016 at 11:47 PM, Julian Hyde <[email protected]>
>> >> wrote:
>> >>
>> >>> Yes, this is possible. Your implementation of Schema.getTable(String
>> >>> name) should always return the same Table object. When that Table
>> >>> object is used (e.g. when it is wrapped in a RelOptTableImpl), Calcite
>> >>> will supply the path.
>> >>>
>> >>> Julian
>> >>>
>> >>>> On Aug 13, 2016, at 1:00 AM, Ravikumar CS <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> Is it possible to query dynamic tables within a table schema in
>> >>>> Calcite? That is, the table name comes as part of the SQL (and keeps
>> >>>> changing), but all the names map to the same Calcite table
>> >>>> implementation (say, JSONTable).
>> >>>>
>> >>>> Any pointers on how this could be achieved?
>> >>>>
>> >>>> *Example:*
>> >>>>
>> >>>> *1.* select * from foo_schema."/data/foo.json"
>> >>>>
>> >>>> *2.* select * from foo_schema."/data/bar.json"
>> >>>>
>> >>>> *3.* select * from foo_schema."/data/baz.json"
>> >>>
>> >>>
>> >>
>>
>>
>
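
The lazily populated schema discussed in this thread (getTable(String name)
creating a table the first time a path is seen, caching it in a map, and each
table keeping its own File) might look roughly like the sketch below. It is
plain Java with hypothetical stand-in types (SketchTable, FileBackedTable,
DynamicFileSchema); these are not Calcite's Schema or Table interfaces, and the
sketch does not show how to stop the validator from consulting getTableNames().

```java
import java.io.File;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a table; NOT org.apache.calcite.schema.Table.
interface SketchTable {
  File file();
}

// Each table remembers the file it reads, following the model described
// in the thread where a File is stored in the table at creation time.
final class FileBackedTable implements SketchTable {
  private final File file;

  FileBackedTable(File file) {
    this.file = file;
  }

  @Override public File file() {
    return file;
  }
}

// Hypothetical stand-in for a schema whose table names are arbitrary paths.
final class DynamicFileSchema {
  private final Map<String, SketchTable> tables = new ConcurrentHashMap<>();

  // Creates the table the first time a path is requested, then returns
  // the cached instance on every later call with the same path.
  SketchTable getTable(String path) {
    return tables.computeIfAbsent(path, p -> new FileBackedTable(new File(p)));
  }

  // Only paths that have already been requested are known here, so this
  // set is empty until getTable has been called at least once.
  Set<String> getTableNames() {
    return Collections.unmodifiableSet(tables.keySet());
  }
}
```

In this sketch getTableNames() only reports paths that have already been
requested, which is the limitation raised above: a validator that checks the
name list before calling getTable would not find a path it has never seen.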
