+1

Yes, I think the SDK should provide local dictionary support also.

Regards,
Jacky

> On November 5, 2018, at 2:14 PM, manish gupta <[email protected]> wrote:
>
> Hi Dev
>
> Currently we support the LOCAL DICTIONARY feature during the data load
> operation. The feature is very helpful in that it reduces the store
> size, which helps in reducing the IO and thereby enhances query
> performance.
> *This proposal is to extend the LOCAL DICTIONARY feature and provide a
> separate DDL and offline support for it. This will make usage of the
> feature more flexible. The reasons for proposing this are*:
>
> 1. DDL support, which can enable stores without local dictionary to add
> this feature for already loaded data. This can help customers leverage
> the LOCAL DICTIONARY feature for data that is written in carbondata
> format without local dictionary.
> 2. We know that when local dictionary is enabled there is a small
> degradation in data load performance. So there can be
> applications/customers who want to fine-tune the loaded data in off-peak
> time. This feature can be helpful for those kinds of scenarios.
> 3. Offline support is proposed for SDK-like features, where we do not
> have the Spark driver-executor model and there may be only a single
> thread used for loading data. For this scenario we can provide offline
> support, thereby not impacting the existing data load performance.
>
> Please let me know your suggestions for this proposal. If most community
> members feel the idea is good and will make usage of this feature more
> flexible, I can come up with a design and discuss it further on this
> platform.
>
> Regards
> Manish Gupta
