I created a JIRA for this based on the Hangout today! https://issues.apache.org/jira/browse/DRILL-5461

On Mon, Mar 6, 2017 at 7:55 AM, John Omernik wrote:

> I can see both sides. But Ted is right, this won't hurt anything from a
> performance perspective. Even if they put War and Peace in there 30 times,
> that's about 100 MB of information to serve. People may choose to use
> formatting languages like Markup or something. I do think we should have a
> limit so we know what happens if someone tries to break that limit (from a
> security perspective), but we could set that quite high, and then just test
> putting data that exceeds that as a unit test.
>
> On Fri, Mar 3, 2017 at 8:28 PM, Ted Dunning wrote:
>
>> All of War and Peace is only 3MB.
>>
>> Let people document however they want. Don't over-optimize for problems
>> that have never occurred.
>>
>> On Fri, Mar 3, 2017 at 3:19 PM, Kunal Khatua wrote:
>>
>> > It might be, in case someone begins to dump a massive design doc into the
>> > comment field for a view's JSON.
>> >
>> > I'm also not sure about how this information can be consumed. If it is
>> > through the CLI, either we rely on the SQLLine shell to trim the output,
>> > or not worry at all. I'm assuming we'd also probably want something like a
>> >
>> > DESCRIBE VIEW ...
>> >
>> > to be enhanced to something like
>> >
>> > DESCRIBE VIEW WITH COMMENTARY ...
>> >
>> > A 1 KB field is quite generous IMHO. That's more than 7 tweets to describe
>> > something!
>> >
>> > Kunal Khatua
>> >
>> > ________________________________
>> > From: Ted Dunning
>> > Sent: Friday, March 3, 2017 12:56:44 PM
>> > To: user
>> > Subject: Re: Discussion: Comments in Drill Views
>> >
>> > Is it really necessary to put a technical limit in to prevent people from
>> > OVER-documenting views?
>> >
>> > When is the last time you saw code that had too many comments in it?
>> >
>> > On Thu, Mar 2, 2017 at 8:42 AM, John Omernik wrote:
>> >
>> > > So I think on your worry, that's an easily definable "abuse" condition,
>> > > i.e. if we set a limit of, say, 1024 characters, that provides ample
>> > > space for descriptions, but at 1 KB per view, that's an allowable
>> > > condition, i.e. it would be hard to abuse it ... or am I missing
>> > > something?
>> > >
>> > > On Wed, Mar 1, 2017 at 8:08 PM, Kunal Khatua wrote:
>> > >
>> > > > +1
>> > > >
>> > > > I think this can be very useful. The only worry is of someone abusing
>> > > > it, so we probably should have a limit on the size of this? Not sure
>> > > > how else it could be exposed and consumed.
>> > > >
>> > > > Kunal Khatua
>> > > > Engineering
>> > > > MapR
>> > > > www.mapr.com
>> > > >
>> > > > ________________________________
>> > > > From: John Omernik
>> > > > Sent: Wednesday, March 1, 2017 9:55:27 AM
>> > > > To: user
>> > > > Subject: Re: Discussion: Comments in Drill Views
>> > > >
>> > > > Sorry, I let this idea drop (I didn't follow up, and found it when
>> > > > searching for something else...). Any other thoughts on this idea?
>> > > >
>> > > > Should I open a JIRA if people think it would be handy?
>> > > >
>> > > > On Thu, Jun 23, 2016 at 4:02 PM, Ted Dunning wrote:
>> > > >
>> > > > > This is very interesting. I love docstrings in Lisp and Python and
>> > > > > Javadoc in Java.
>> > > > >
>> > > > > Basically this is like that, but for SQL. Very helpful.
>> > > > >
>> > > > > On Thu, Jun 23, 2016 at 11:48 AM, John Omernik wrote:
>> > > > >
>> > > > > > I am looking for discussion here. A colleague was asking me how to
>> > > > > > add comments to the metadata of a view.
>> > > > > > (He's new to Drill, thus the idea of not having metadata for a
>> > > > > > table is one he's warming up to).
>> > > > > >
>> > > > > > That got me thinking... why couldn't we use Drill Views to store
>> > > > > > table/field comments? This could be a great way to help add
>> > > > > > contextual information for users. Here are some current
>> > > > > > observations when I issue a describe view_myview:
>> > > > > >
>> > > > > > 1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE.
>> > > > > > 2. Even though the underlying Parquet table has types, the view
>> > > > > > does not pass the types for the underlying Parquet files through.
>> > > > > > (The type is ANY.)
>> > > > > > 3. The data for the view is all just a JSON file that could be
>> > > > > > easily extended.
>> > > > > >
>> > > > > > So, a few things would be nice to have:
>> > > > > >
>> > > > > > 1. Table comments. When I issue a describe table, if the view has
>> > > > > > a "Description" field, then having that print out as a description
>> > > > > > for the whole view would be nice. This is harder, I think, because
>> > > > > > it's not just extending the view information.
>> > > > > >
>> > > > > > 2. Column comments: A text field that could be added to the view,
>> > > > > > and just print out another column with the description. This would
>> > > > > > be very helpful. While Drill being schemaless is awesome, the
>> > > > > > ability to add information to known data is huge.
>> > > > > >
>> > > > > > 3. Ability to use the types from the Parquet files (without
>> > > > > > manually specifying each type). If we could provide an option to
>> > > > > > view creation to attempt to infer type, that would be handy.
>> > > > > > I realize that folks are using the LIMIT 0 to get metadata, but
>> > > > > > describe could be done well too.
>> > > > > >
>> > > > > > 4. Ability, using ANSI SQL, to update the view column descriptions
>> > > > > > and the description for the view itself.
>> > > > > >
>> > > > > > 5. I believe Avro has the ability to add this information to the
>> > > > > > files, so if the data exists outside of views (such as in Avro
>> > > > > > files), should we present it to the user in describe table events
>> > > > > > as well?
>> > > > > >
>> > > > > > Curious if folks think this would be valuable, how much work an
>> > > > > > addition like this would be to Drill, and other thoughts in
>> > > > > > general.
>> > > > > >
>> > > > > > John
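
For anyone picking this up from the JIRA, here is a rough Python sketch of the two ideas discussed in this thread: attaching a description to a view and to its columns, and enforcing a deliberately high size cap whose violation can be unit tested. Everything in it is an assumption made for illustration. The dict layout, the "description" key, the names set_view_comments, describe, and MAX_COMMENT_CHARS, and the cap value are hypothetical and are not Drill's actual view file format or API.

```python
import copy

# Hypothetical cap, set high on purpose as suggested in the thread.
# The value is illustrative only, not a Drill setting.
MAX_COMMENT_CHARS = 1_000_000


class CommentTooLongError(ValueError):
    """Raised when a view or column comment exceeds the configured cap."""


def _check_length(label, text, max_chars):
    # Reject over-long text with a clear message instead of silently trimming.
    if len(text) > max_chars:
        raise CommentTooLongError(
            f"{label} is {len(text)} characters; the limit is {max_chars}"
        )


def set_view_comments(view, description=None, column_comments=None,
                      max_chars=MAX_COMMENT_CHARS):
    """Return a copy of a view definition with comments attached.

    `view` is a plain dict standing in for the view's JSON (assumed layout:
    a "fields" list of {"name": ..., "type": ...} entries).
    """
    updated = copy.deepcopy(view)  # leave the caller's dict untouched
    if description is not None:
        _check_length("view description", description, max_chars)
        updated["description"] = description
    fields_by_name = {f["name"]: f for f in updated.get("fields", [])}
    for column, text in (column_comments or {}).items():
        if column not in fields_by_name:
            raise KeyError(f"view has no column named {column!r}")
        _check_length(f"comment on column {column!r}", text, max_chars)
        fields_by_name[column]["description"] = text
    return updated


def describe(view):
    """Return (COLUMN_NAME, DATA_TYPE, DESCRIPTION) rows for a view dict."""
    return [
        (f["name"], f.get("type", "ANY"), f.get("description", ""))
        for f in view.get("fields", [])
    ]


if __name__ == "__main__":
    view = {
        "name": "view_myview",
        "sql": "SELECT id, ts FROM t",
        "fields": [{"name": "id", "type": "ANY"}, {"name": "ts", "type": "ANY"}],
    }
    documented = set_view_comments(
        view,
        description="Events joined with user data.",
        column_comments={"id": "Primary key of the event."},
    )
    for row in describe(documented):
        print(row)
```

Raising an error rather than truncating keeps the over-limit case easy to assert on in a test, which is the behavior the thread asks to pin down. Whether the cap should be 1 KB or far larger is left open here, as it is in the discussion.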
