@Charles : when you read csv files can casting and get the data type as per your requirements
Hope I answer what you asked for :) HTH Thanks, Divya On 20 October 2017 at 01:42, Charles Givre <[email protected]> wrote: > Hi Alfredo, > When I was trying to get Drill to work with various BI tools, all I really > needed was a list of columns. Data types would be a big bonus, but Drill > interprets CSV data as plain text anyway. It would be really useful for > other file types where Drill does infer data types. > > > — C > > > > On Oct 19, 2017, at 6:13 AM, Alfredo Serafini <[email protected]> wrote: > > > > Hi thanks for the replies! > > > > @Chun yes using Views is an approach I considered, and I like it also > > methodologically, in order to have some change to "prepare" the data > > just a bit. I'm testing drill as a sort of data facade for tools which > > handles mappings to other context, so this could be helpful for me. > > > > Anyway I have some concerns regardings metadata/catalog support for > > views too: it seems that every view is saved on disk as a JSON file, > > then experimenting the same issues. Are you suggesting saving views to > > some kind of relational database storage for staging purposes? Is that > > possible? > > > > Sorry for all the questions :-) > > > > > > @Charles yes Metabase (or Tableau, Superset, and so on...) is another > > use case in which it would be great to connect them to explore data > > with the capabilities of drill, and even for an initial exploration of > > data since sometimes reducing the initial analysis phase time could > > help with development. > > > > For CSV it would be possible IMHO to guess types in a very basic way, > > at least using basic types and map columns to a text/String when a > > type can't be inferreed. It could be a starting point, and probably > > the more confortable case where to start for the (partial) support of > > catalog informations (JSON would be more complex, just to say). If > > there are standard interfaces that can be extended/implemented for > > filling them with those informations I'd like to do some > > experimentation on that, if it's not too complex to follow, and if > > someone can point me to a good place where to start for doing some > > experiments of a possible implementation, for the CSV case. > > > > Thanks for the comments, I appreciate them > > > > Alfredo > > > > > > > > I’d like to second Alfredo’s request. I’ve been trying to get Drill > > to work with some > >> open source visualization tools such as SqlPad and Metabase and the > issue I keep running into > >> is that Drill doesn’t have a convenient way to describe how it > interprets flat files. This > >> is really frustrating for me since this is my main use of Drill! > >> I wish the SELECT * FROM <data> LIMIT 0 worked in the RESTFul > interface. In any event, > >> would be very useful to have some way to get Drill to describe how it > will interpret a flat > >> file. > >> — C > > > > > > > >> On Oct 18, 2017, at 15:20, Chun Chang <[email protected]> wrote: > >> > >> There were discussions on the need of building a catalog for drill. But > I don't think > > that's the focus right now. And I am not sure the community will ever > > decide to go in that > > direction. For now, you best bet is to create views on top of your > > JSON/CSV data. > >> > >> ________________________________ > >> From: Alfredo Serafini <[email protected]> > >> Sent: Wednesday, October 18, 2017 8:31:15 AM > >> To: [email protected] > >> Subject: describe query support? (catalog metadata, etc) > >> > >> Hi I'm experimenting using Drill as a data virtualization component via > >> JDBC and it generally works great for my needs. > >> > >> However some of the components connected via JDBC needs basic > >> metadata/catalog informations, and they seems to be missing for JSON / > CSV > >> sources. > >> > >> For example the simple query > >> > >> DESCRIBE cp.`employee.json`; > >> > >> returns no results. > >> > >> Another possible example case could be when reading from an sqlite > source > >> containing the same data on an `employees` table > >> DESCRIBE `emploees` > >> > >> and still get no information: while this command is not directly > supported > >> in SQLite, an equivalent one could be for instance: > >> PRAGMA table_info(`employees`); > >> > >> but trying to execute it in Drill is not possible, as it is beyond the > >> supported standard SQL dialect. > >> > >> Moreover using a query like: > >> SELECT * > >> FROM INFORMATION_SCHEMA.COLUMNS > >> WHERE (TABLE_NAME='employees_view'); > >> > >> on a view from the same data, seems to return the informations, so I > >> suppose there should be a way to pass those informations to an > >> internal *DatabaseMetaData > >> <https://docs.oracle.com/javase/8/docs/api/java/sql/ > DatabaseMetaData.html>* > >> implementation. > >> I wonder if there is such a component designed to manage all the catalog > >> informations for different sources? > >> > >> In this case it could adopt different strategies for retrieving > metadata, > >> depending on the case: for sqlite a different command / dialect could be > >> used, for CSV types could be guessed using simple heuristics, and so on. > >> Probably cases like JSON would be much more complex, anyway. > >> Once the metadata have been retrieved for a source, I suppose the > standard > >> SQL dialect should work as expected. > >> > >> > >> Are there any plans to add catalog metadata support for various sources? > >> Does anybody have some workaround? for example using views or similar > >> approaches? > >> > >> > >> thanks in advance, sorry if the message is too long :-) > >> Alfredo > >
