I created a JIRA for this based on the Hangout today!

https://issues.apache.org/jira/browse/DRILL-5461



On Mon, Mar 6, 2017 at 7:55 AM, John Omernik <[email protected]> wrote:

> I can see both sides. But Ted is right: this won't hurt anything from a
> performance perspective. Even if they put War and Peace in there 30 times,
> that's roughly 100 MB of information to serve. People may choose to use
> formatting languages like Markdown or something. I do think we should have a
> limit so we know what happens if someone tries to break that limit (from a
> security perspective), but we could set that quite high and then just test
> putting data that exceeds it as a unit test.
>
>
>
> On Fri, Mar 3, 2017 at 8:28 PM, Ted Dunning <[email protected]> wrote:
>
>> All of War and Peace is only 3MB.
>>
>> Let people document however they want. Don't over-optimize for problems
>> that have never occurred.
>>
>>
>>
>> On Fri, Mar 3, 2017 at 3:19 PM, Kunal Khatua <[email protected]> wrote:
>>
>> > It might be, in case someone begins to dump a massive design doc into the
>> > comment field for a view's JSON.
>> >
>> >
>> > I'm also not sure about how this information can be consumed. If it is
>> > through CLI, either we rely on the SQLLine shell to trim the output, or not
>> > worry at all. I'm assuming we'd also probably want something like a
>> >
>> > DESCRIBE VIEW ...
>> >
>> > to be enhanced to something like
>> >
>> > DESCRIBE VIEW WITH COMMENTARY ...
>> >
>> >
>> > A 1KB field is quite generous IMHO. That's more than 7 tweets to describe
>> > something!
>> >
>> >
>> > Kunal Khatua
>> >
>> > ________________________________
>> > From: Ted Dunning <[email protected]>
>> > Sent: Friday, March 3, 2017 12:56:44 PM
>> > To: user
>> > Subject: Re: Discussion: Comments in Drill Views
>> >
>> > Is it really necessary to put a technical limit in to prevent people from
>> > OVER-documenting views?
>> >
>> >
>> > When was the last time you saw code that had too many comments in it?
>> >
>> >
>> >
>> > On Thu, Mar 2, 2017 at 8:42 AM, John Omernik <[email protected]> wrote:
>> >
>> > > So I think on your worry that's an easily definable "abuse" condition...
>> > > i.e. if we set a limit of say 1024 characters, that provides ample space
>> > > for descriptions, but at 1 KB per view, that's an allowable condition, i.e.
>> > > it would be hard to abuse it ... or am I missing something?
>> > >
>> > > On Wed, Mar 1, 2017 at 8:08 PM, Kunal Khatua <[email protected]> wrote:
>> > >
>> > > > +1
>> > > >
>> > > >
>> > > > I think this can be very useful. The only worry is of someone abusing it,
>> > > > so we probably should have a limit on the size of this? Not sure how else
>> > > > it could be exposed and consumed.
>> > > >
>> > > >
>> > > > Kunal Khatua
>> > > >
>> > > > Engineering
>> > > >
>> > > > www.mapr.com
>> > > >
>> > > > ________________________________
>> > > > From: John Omernik <[email protected]>
>> > > > Sent: Wednesday, March 1, 2017 9:55:27 AM
>> > > > To: user
>> > > > Subject: Re: Discussion: Comments in Drill Views
>> > > >
>> > > > Sorry, I let this idea drop (I didn't follow up and found it when
>> > > > searching for something else...). Any other thoughts on this idea?
>> > > >
>> > > > Should I open a JIRA if people think it would be handy?
>> > > >
>> > > > On Thu, Jun 23, 2016 at 4:02 PM, Ted Dunning <[email protected]> wrote:
>> > > >
>> > > > > This is very interesting.  I love docstrings in Lisp and Python and
>> > > > > Javadoc in Java.
>> > > > >
>> > > > > Basically this is like that, but for SQL. Very helpful.
>> > > > >
>> > > > > On Thu, Jun 23, 2016 at 11:48 AM, John Omernik <[email protected]> wrote:
>> > > > >
>> > > > > > I am looking for discussion here. A colleague was asking me how to add
>> > > > > > comments to the metadata of a view.  (He's new to Drill, thus the idea of
>> > > > > > not having metadata for a table is one he's warming up to.)
>> > > > > >
>> > > > > > That got me thinking... why couldn't we use Drill Views to store
>> > > > > > table/field comments?  This could be a great way to help add contextual
>> > > > > > information for users. Here are some current observations when I issue a
>> > > > > > describe view_myview:
>> > > > > >
>> > > > > >
>> > > > > > 1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE.
>> > > > > > 2. Even though the underlying parquet table has types, the view does not
>> > > > > > pass the types for the underlying parquet files through.  (The type is ANY.)
>> > > > > > 3. The data for the view is all just a json file that could be easily
>> > > > > > extended.
>> > > > > >
>> > > > > >
>> > > > > > So, a few things would be nice to have:
>> > > > > >
>> > > > > > 1. Table comments.  When I issue a describe table, if the view has a
>> > > > > > "Description" field, then having that print out as a description for the
>> > > > > > whole view would be nice.  This is harder, I think, because it's not just
>> > > > > > extending the view information.
>> > > > > >
>> > > > > > 2. Column comments:  A text field that could be added to the view, and just
>> > > > > > print out another column with description.  This would be very helpful.
>> > > > > > While Drill being schemaless is awesome, the ability to add information to
>> > > > > > known data is huge.
>> > > > > >
>> > > > > > 3. Ability to use the types from the Parquet files (without manually
>> > > > > > specifying each type).  If we could provide an option to View creation to
>> > > > > > attempt to infer type, that would be handy. I realize that folks are using
>> > > > > > the LIMIT 0 to get metadata, but describe could be done well too.
>> > > > > >
>> > > > > > 4. Ability, using ANSI SQL, to update the view column descriptions and the
>> > > > > > description for the view itself.
>> > > > > >
>> > > > > > 5. I believe Avro has the ability to add this information to the files, so
>> > > > > > if the data exists outside of views (such as in Avro files), should we
>> > > > > > present it to the user in describe table events as well?
>> > > > > >
>> > > > > > Curious if folks think this would be valuable, how much work an addition
>> > > > > > like this would be to Drill, and other thoughts in general.
>> > > > > >
>> > > > > >
>> > > > > > John
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
