Re: [Discussion] Carbon Local Dictionary Support

2018-07-12 Thread xuchuanyin
Hi, kumarvishal: As the local dictionary feature will be released in 1.4.1, is there any difference between the implementation and the previous design document? I'm trying to understand the implementation of local dictionary. If there is any difference, please help to update the document in

Re: [Discussion] Carbon Local Dictionary Support

2018-06-14 Thread xm_zzc
Hi kumarvishal09: Will this feature be supported on streaming tables too? -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [Discussion] Carbon Local Dictionary Support

2018-06-08 Thread akashrn5
1. If the user gives an invalid value, the default threshold (1000 unique values) will be considered. What is the consideration behind the default value 1000? *1000 is a random value we have mentioned in the design doc. CARBON_LOCALDICT_THRESHOLD is exposed to the user for setting the threshold
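
The fallback described above (an invalid value falls back to the default of 1000 unique values) can be sketched as follows. This is only an illustration, not CarbonData's code: the function name resolve_threshold is hypothetical, and the thread does not say which values count as invalid, so the sketch assumes that missing, non-integer, and non-positive values are all rejected.

```python
DEFAULT_LOCAL_DICT_THRESHOLD = 1000  # default named in the thread


def resolve_threshold(raw_value):
    """Return the user-supplied threshold, or the default if it is invalid.

    Assumption (not from the thread): "invalid" means missing,
    non-integer, or not strictly positive.
    """
    try:
        value = int(str(raw_value).strip())
    except (TypeError, ValueError):
        # Not parseable as an integer: fall back to the default.
        return DEFAULT_LOCAL_DICT_THRESHOLD
    # Zero and negative thresholds are treated as invalid too.
    return value if value > 0 else DEFAULT_LOCAL_DICT_THRESHOLD
```

With this sketch, a setting of "500" is used as given, while "abc", "-5", or an unset value all resolve to 1000.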

Re: [Discussion] Carbon Local Dictionary Support

2018-06-08 Thread chetdb
Dear Vishal, Please find the queries/comments on the design doc. 1. If the user gives an invalid value, the default threshold (1000 unique values) will be considered. What is the consideration behind the default value 1000? 2. There is no option mentioned for the user to alter the

Re: [Discussion] Carbon Local Dictionary Support

2018-06-08 Thread akashrn5
Hi bhavya, Local dictionary generation is at task level. If the threshold is breached during an ongoing load, the local dictionary will not be generated for the corresponding column in that load, and there is no dependency on previous loads. For each load a new local dictionary will be
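
A minimal sketch of the behaviour described above: each load builds its own dictionary for a column and abandons it once the number of unique values exceeds the threshold. The function name encode_column and the choice to return the raw values on a breach are assumptions made for illustration; the thread does not describe the actual fallback storage.

```python
DEFAULT_THRESHOLD = 1000  # default named in the thread


def encode_column(values, threshold=DEFAULT_THRESHOLD):
    """Dictionary-encode one column for one load.

    Returns (dictionary, codes) when the number of unique values stays
    within the threshold, or (None, raw_values) when it is breached.
    No state is shared between calls, so every load starts fresh.
    """
    dictionary = {}
    codes = []
    for value in values:
        code = dictionary.get(value)
        if code is None:
            if len(dictionary) >= threshold:
                # Threshold breached: no local dictionary for this load.
                return None, list(values)
            code = len(dictionary)
            dictionary[value] = code
        codes.append(code)
    return dictionary, codes
```

Because nothing is carried over between calls, a later load with low cardinality gets a dictionary again even if an earlier load breached the threshold.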

Re: [Discussion] Carbon Local Dictionary Support

2018-06-08 Thread akashrn5
Hi xuchuanyin, Please find my comments inline. About query filtering: 1. “during filter, actual filter values will be generated using column local dictionary values...then filter will be applied on the dictionary encode data” --- If the filter is not 'equal' but 'like' or 'greater than', can it
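
For readers following the filtering question, here is one common way to run non-equality filters on dictionary-encoded data: evaluate the predicate once per dictionary value, collect the matching codes, then test each encoded row for membership. This is a generic sketch and not confirmed by this thread to be what CarbonData does; the function names and the sample data are made up for illustration.

```python
def matching_codes(dictionary, predicate):
    """Evaluate the predicate on each distinct value and keep matching codes."""
    return {code for value, code in dictionary.items() if predicate(value)}


def filter_rows(encoded, codes):
    """Return the row indexes whose code is in the matching set."""
    return [row for row, code in enumerate(encoded) if code in codes]


# Hypothetical column: value -> code, plus the encoded rows.
dictionary = {"apple": 0, "banana": 1, "cherry": 2}
encoded = [0, 2, 1, 0, 2]

equal_rows = filter_rows(encoded, matching_codes(dictionary, lambda v: v == "apple"))
greater_rows = filter_rows(encoded, matching_codes(dictionary, lambda v: v > "apple"))
like_rows = filter_rows(encoded, matching_codes(dictionary, lambda v: v.startswith("b")))
```

The predicate runs once per distinct value rather than once per row, which is where the saving comes from when cardinality is low.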

Re: [Discussion] Carbon Local Dictionary Support

2018-06-07 Thread Bhavya Aggarwal
Hi Vishal, Thanks for sharing the design and I have one question related to deciding on whether to generate the dictionary or not. If in first few loads we have the cardinality below the threshold then we will create a local dictionary, but if in subsequent loads the threshold value is breached

Re: [Discussion] Carbon Local Dictionary Support

2018-06-07 Thread xuchuanyin
About query filtering: 1. “during filter, actual filter values will be generated using column local dictionary values...then filter will be applied on the dictionary encode data” --- If the filter is not 'equal' but 'like' or 'greater than', can it also run on the encoded data? 2. "As dictionary data

Re: [Discussion] Carbon Local Dictionary Support

2018-06-07 Thread manish gupta
Hi Vishal, Thanks for uploading the design document. The document is good and gives a detailed picture of the requirement. I have a few questions and suggestions. Kindly consider them if applicable. 1. Will the local dictionary be read once and put into offheap/onheap memory or for every query it will

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread Kumar Vishal
Hi Xuchuanyin, Please find the JIRA link for local dictionary support. https://issues.apache.org/jira/browse/CARBONDATA-2584 -Regards Kumar Vishal On Wed, Jun 6, 2018 at 6:25 PM, xuchuanyin wrote: > Hi, Kumar: > Can you raise a Jira and provide the document as attachment? I cannot > open

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread xuchuanyin
Hi, Kumar: Can you raise a Jira and provide the document as an attachment? I cannot open the links since they are blocked.

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread Kumar Vishal
Hi All, Please ignore the above link. Please comment here: https://docs.google.com/document/d/1y0dJSWOr0ZTPpbNOOUfVfU5SoANL5B1F0l7jhl8BgUs/edit?usp=sharing -Regards Kumar Vishal On Wed, Jun 6, 2018 at 3:06 PM, Kumar Vishal wrote: > Hi All, > > Due to some problem above link is not working.

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread Kumar Vishal
Hi All, Due to some problem the above link is not working. Please find the updated link. https://drive.google.com/file/d/10LqtQlrE4jeotmleoMLJ8F91rK2TrN2h/view?usp=sharing -Regards Kumar Vishal On Wed, Jun 6, 2018 at 2:40 PM, Kumar Vishal wrote: > Hi All, > > Please find the link for design doc.

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread Kumar Vishal
Hi All, Please find the link for design doc. https://drive.google.com/file/d/1eqfIms2tMi3b63nMbKfGRZYmo7TMyE1_/view?usp=sharing -Regards Kumar Vishal On Wed, Jun 6, 2018 at 2:25 PM, Kumar Vishal wrote: > Hi Community, > > Please find the Attached Local dictionary support design document.

Re: [Discussion] Carbon Local Dictionary Support

2018-06-06 Thread Kumar Vishal
Hi Community, Please find attached the local dictionary support design document. Please let me know if any further clarification on the design document is needed. Any further inputs/improvements are most welcome. -Regards Kumar Vishal On Tue, Jun 5, 2018 at 6:14 PM, Jacky Li wrote: > +1 > Good feature

Re: [Discussion] Carbon Local Dictionary Support

2018-06-05 Thread Jacky Li
+1 Good feature to add in CarbonData Regards, Jacky > On June 4, 2018, at 11:10 PM, Kumar Vishal wrote: > > Hi Community, Currently CarbonData supports global dictionary or > No-Dictionary (Plain-Text stored in LV format) for storing dimension column > data. > > *Bottleneck with Global Dictionary* > >

Re: [Discussion] Carbon Local Dictionary Support

2018-06-05 Thread xm_zzc
Hi: +1. This is an exciting feature, hope to have it in version 1.5. -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [Discussion] Carbon Local Dictionary Support

2018-06-04 Thread manish gupta
+1 It is a good feature to have. Once the design document is uploaded we will get a better idea of how it will be implemented. Regards Manish Gupta On Tue, Jun 5, 2018 at 11:18 AM, Kumar Vishal wrote: > Hi Xuchuanyin, > > I am working on design document, and all the points you have mentioned

Re: [Discussion] Carbon Local Dictionary Support

2018-06-04 Thread Kumar Vishal
Hi Xuchuanyin, I am working on design document, and all the points you have mentioned I have already captured. I will share once it is finished. -Regards Kumar Vishal On Tue, Jun 5, 2018 at 9:22 AM, xuchuanyin wrote: > Hi, Kumar: > Local dictionary will be nice feature and other formats

Re: [Discussion] Carbon Local Dictionary Support

2018-06-04 Thread Ravindra Pesala
Hi Vishal, +1 Thank you for starting a discussion on it. It will be a very helpful feature to improve query performance and reduce the memory footprint. Please add the design document for the same. Regards, Ravindra. On 5 June 2018 at 09:22, xuchuanyin wrote: > Hi, Kumar: > Local

Re: [Discussion] Carbon Local Dictionary Support

2018-06-04 Thread xuchuanyin
Hi, Kumar: Local dictionary will be a nice feature, and other formats like Parquet all support this. My concern is: how will you implement this feature? 1. What's the scope of the `local`? Page level (for all containing rows), Blocklet level (for all containing pages), Block level(for

[Discussion] Carbon Local Dictionary Support

2018-06-04 Thread Kumar Vishal
Hi Community, Currently CarbonData supports global dictionary or No-Dictionary (Plain-Text stored in LV format) for storing dimension column data. *Bottleneck with Global Dictionary* 1. As the dictionary file is a mutable file, it is not possible to support global dictionary in storage