Re: describe query support? (catalog metadata, etc)

Divya Gehlot Mon, 23 Oct 2017 23:04:04 -0700

@Charles : when you read csv files can casting and get the data type as per
your requirements


Hope I answer what you asked for :)

HTH
Thanks,
Divya

On 20 October 2017 at 01:42, Charles Givre <[email protected]> wrote:

> Hi Alfredo,
> When I was trying to get Drill to work with various BI tools, all I really
> needed was a list of columns.  Data types would be a big bonus, but Drill
> interprets CSV data as plain text anyway.  It would be really useful for
> other file types where Drill does infer data types.
>
>
> — C
>
>
> > On Oct 19, 2017, at 6:13 AM, Alfredo Serafini <[email protected]> wrote:
> >
> > Hi thanks for the replies!
> >
> > @Chun yes using Views is an approach I considered, and I like it also
> > methodologically, in order to have some change to "prepare" the data
> > just a bit. I'm testing drill as a sort of data facade for tools which
> > handles mappings to other context, so this could be helpful for me.
> >
> > Anyway I have some concerns regardings metadata/catalog support for
> > views too: it seems that every view is saved on disk as a JSON file,
> > then experimenting the same issues. Are you suggesting saving views to
> > some kind of relational database storage for staging purposes? Is that
> > possible?
> >
> > Sorry for all the questions :-)
> >
> >
> > @Charles yes Metabase (or Tableau, Superset, and so on...) is another
> > use case in which it would be great to connect them to explore data
> > with the capabilities of drill, and even for an initial exploration of
> > data since sometimes reducing the initial analysis phase time could
> > help with development.
> >
> > For CSV it would be possible IMHO to guess types in a very basic way,
> > at least using basic types and map columns to a text/String when a
> > type can't be inferreed. It could be a starting point, and probably
> > the more confortable case where to start for the (partial) support of
> > catalog informations (JSON would be more complex, just to say). If
> > there are standard interfaces that can be extended/implemented for
> > filling them with those informations I'd like to do some
> > experimentation on that, if it's not too complex to follow, and if
> > someone can point me to a good place where to start for doing some
> > experiments of a possible implementation, for the CSV case.
> >
> > Thanks for the comments, I appreciate them
> >
> > Alfredo
> >
> >
> >
> > I’d like to second Alfredo’s request.  I’ve been trying to get Drill
> > to work with some
> >> open source visualization tools such as SqlPad and Metabase and the
> issue I keep running into
> >> is that Drill doesn’t have a convenient way to describe how it
> interprets flat files.  This
> >> is really frustrating for me since this is my main use of Drill!
> >> I wish the SELECT * FROM <data> LIMIT 0 worked in the RESTFul
> interface.  In any event,
> >> would be very useful to have some way to get Drill to describe how it
> will interpret a flat
> >> file.
> >> — C
> >
> >
> >
> >> On Oct 18, 2017, at 15:20, Chun Chang <[email protected]> wrote:
> >>
> >> There were discussions on the need of building a catalog for drill. But
> I don't think
> > that's the focus right now. And I am not sure the community will ever
> > decide to go in that
> > direction. For now, you best bet is to create views on top of your
> > JSON/CSV data.
> >>
> >> ________________________________
> >> From: Alfredo Serafini <[email protected]>
> >> Sent: Wednesday, October 18, 2017 8:31:15 AM
> >> To: [email protected]
> >> Subject: describe query support? (catalog metadata, etc)
> >>
> >> Hi I'm experimenting using Drill as a data virtualization component via
> >> JDBC and it generally works great for my needs.
> >>
> >> However some of the components connected via JDBC needs basic
> >> metadata/catalog informations, and they seems to be missing for JSON /
> CSV
> >> sources.
> >>
> >> For example the simple query
> >>
> >> DESCRIBE cp.`employee.json`;
> >>
> >> returns no results.
> >>
> >> Another possible example case could be when reading from an sqlite
> source
> >> containing the same data on an `employees` table
> >> DESCRIBE `emploees`
> >>
> >> and still get no information: while this command is not directly
> supported
> >> in SQLite, an equivalent one could be for instance:
> >> PRAGMA table_info(`employees`);
> >>
> >> but trying to execute it in Drill is not possible, as it is beyond the
> >> supported standard SQL dialect.
> >>
> >> Moreover using a query like:
> >> SELECT *
> >> FROM INFORMATION_SCHEMA.COLUMNS
> >> WHERE (TABLE_NAME='employees_view');
> >>
> >> on a view from the same data, seems to return the informations, so I
> >> suppose there should be a way to pass those informations to an
> >> internal *DatabaseMetaData
> >> <https://docs.oracle.com/javase/8/docs/api/java/sql/
> DatabaseMetaData.html>*
> >> implementation.
> >> I wonder if there is such a component designed to manage all the catalog
> >> informations for different sources?
> >>
> >> In this case it could adopt different strategies for retrieving
> metadata,
> >> depending on the case: for sqlite a different command / dialect could be
> >> used, for CSV types could be guessed using simple heuristics, and so on.
> >> Probably cases like JSON would be much more complex, anyway.
> >> Once the metadata have been retrieved for a source, I suppose the
> standard
> >> SQL dialect should work as expected.
> >>
> >>
> >> Are there any plans to add catalog metadata support for various sources?
> >> Does anybody have some workaround? for example using views or similar
> >> approaches?
> >>
> >>
> >> thanks in advance, sorry if the message is too long :-)
> >> Alfredo
>
>

Re: describe query support? (catalog metadata, etc)

Reply via email to