Hi, I saw that Drill is able to bypass SQL validation on tables and run queries on arbitrary paths (as tables). This is exactly what I need. Could someone from Drill comment on this?
*Example:* select * from dfs.some_path_to_json_file.json;

~Ravi

On Wed, Aug 17, 2016 at 8:42 AM, Ravikumar CS <[email protected]> wrote:

> Thanks Julian for your feedback. I will put together some code & tests.
>
> I created a new CsvDynamicSchemaFactory which returns a
> CsvDynamicSchema (based on the CSV example).
> CsvDynamicSchema holds a map containing the tableName -> Table mapping.
> The overridden getTable(String name) creates a table the first time it is
> requested and caches it in the map. However, queries fail during
> validation at getTableNames().
>
> Is there a configuration setting that prevents getTableNames() from being
> called? All these table names are arbitrary file paths and are not known
> beforehand, at least until the first time they are encountered; after
> that they are in the map.
>
> ~Ravi
>
>
> On Wed, Aug 17, 2016 at 2:10 AM, Julian Hyde <[email protected]> wrote:
>
>> Yes, Csv tables are a good place to start. Not sure whether we want to
>> accept extensions into the example/csv module, because it is intended as an
>> example. But do what you need to, create tests, create a pull request, and
>> let's see what we can do with it. Maybe it could be combined into
>> https://issues.apache.org/jira/browse/CALCITE-884, if that work ever
>> gets finished.
>>
>> CsvSchema.createTable creates a File object (based on the table's name)
>> which is then stored in a CsvTable. I think that's a good model for you to
>> follow, whether or not you build upon the Csv adapter. I was mistaken in
>> saying that you could share a single Table object among multiple files.
>> There is simply not enough context passed into the
>> "ScannableTable.scan(DataContext)" method for a table to do the right
>> thing if the table does not know what file it is reading.
>>
>> Julian
>>
>>
>> [1] https://en.wikipedia.org/wiki/Inode
>>
>> > On Aug 16, 2016, at 12:57 AM, Ravikumar CS <[email protected]> wrote:
>> >
>> > Hi Julian,
>> >
>> > Thanks for your reply.
>> >
>> > Could you elaborate a bit when you say, "When that Table object is used
>> > (e.g. when it is
>> > wrapped in a RelOptTableImpl), Calcite will supply the path." ?
>> >
>> > I am currently looking at the CSVTable schema[1] implementation & see
>> > how that could be
>> > enhanced to dynamically accept csv files. Will this logic reside in the
>> > SchemaFactory & how
>> > should RelOptTableImpl be used.
>> >
>> > If this is useful to larger audience, I am happy to provide
>> > documentation.
>> >
>> > ~Ravi
>> >
>> > [1] Calcite Schema:
>> > Schema:
>> > https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchema.java
>> > SchemaFactory:
>> > https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory.java
>> >
>> > On Sat, Aug 13, 2016 at 12:21 AM, Ravikumar CS <[email protected]> wrote:
>> >
>> >> Thanks Julian. Is there an example that I can look at ?
>> >>
>> >> ~Ravi
>> >>
>> >> On Fri, Aug 12, 2016 at 11:47 PM, Julian Hyde <[email protected]> wrote:
>> >>
>> >>> Yes, this is possible. Your implementation of Schema.getTable(String
>> >>> name) should always return the same Table object. When that Table object is
>> >>> used (e.g. when it is wrapped in a RelOptTableImpl), Calcite will supply
>> >>> the path.
>> >>>
>> >>> Julian
>> >>>
>> >>>> On Aug 13, 2016, at 1:00 AM, Ravikumar CS <[email protected]> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> Is it possible to query dynamic tables within a table schema in Calcite ?
>> >>>> That is, the table name comes as part of the SQL (and keeps changing),
>> >>>> but all the names map to the same Calcite table implementation (say
>> >>>> JSONTable).
>> >>>>
>> >>>> Any pointers on how this could be achieved?
>> >>>>
>> >>>> *Example:*
>> >>>>
>> >>>> *1.* select * from foo_schema."/data/foo.json"
>> >>>>
>> >>>> *2.* select * from foo_schema."/data/bar.json"
>> >>>>
>> >>>> *3.* select * from foo_schema."/data/baz.json"
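
For readers following the thread, the lazy-caching idea described above (create a table the first time a path is requested, then reuse it) can be sketched in plain Java. This is only an illustration under stated assumptions: `DynamicFileSchema` and `FileTable` are hypothetical stand-ins and not Calcite classes; only the method names `getTable(String)` and `getTableNames()` are taken from the thread. Each table keeps its own path, in line with Julian's point that a table must know which file it reads. The sketch does not show how to avoid the validation failure at getTableNames() mentioned above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a schema that resolves table names lazily (not Calcite's API). */
class DynamicFileSchema {

    /** Hypothetical table type: each table remembers the file path it was created for. */
    static final class FileTable {
        final String path;

        FileTable(String path) {
            this.path = path;
        }
    }

    // Tables created so far, keyed by the name (file path) used in the query.
    private final Map<String, FileTable> tables = new ConcurrentHashMap<>();

    /** Creates the table on the first lookup and returns the cached instance afterwards. */
    FileTable getTable(String name) {
        return tables.computeIfAbsent(name, FileTable::new);
    }

    /** Only names that were already looked up are known; paths cannot be listed up front. */
    Set<String> getTableNames() {
        return new TreeSet<>(tables.keySet());
    }

    public static void main(String[] args) {
        DynamicFileSchema schema = new DynamicFileSchema();
        FileTable foo = schema.getTable("/data/foo.json");
        schema.getTable("/data/bar.json");

        // The second lookup of the same path returns the cached object.
        System.out.println(foo == schema.getTable("/data/foo.json")); // prints true
        System.out.println(schema.getTableNames()); // prints [/data/bar.json, /data/foo.json]
    }
}
```

Note that in this sketch getTableNames() is empty until a path has been requested once, which mirrors the situation described in the thread where names are unknown before their first use.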
