Regards.
Jihong
-----Original Message-----
From: Jacky Li [mailto:jacky.li...@qq.com]
Sent: Thursday, December 15, 2016 12:55 AM
To: dev@carbondata.incubator.apache.org
Subject: [DISCUSSION] CarbonData loading solution discussion
Hi community,
Since CarbonData has global dictionary feature
+1. We should flexibly choose the loading solution according to Scenario 1 and
2; doing so will bring performance benefits.
--
View this message in context:
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-CarbonData-loading-solution-discussion-tp4490p4520.html
Sent from
> 10K, run two jobs using two output
> formats. Otherwise, run one job that uses TableOutputFormat with
> single-pass support
>
> 2) for subsequent load
> Run one job that uses TableOutputFormat with single-pass support
>
> What do you think of this idea?
>
> Regards,
use TableOutputFormat with single-pass support
What do you think of this idea?
Regards,
Jacky
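
To make the contrast in this thread concrete, here is a minimal sketch of the two approaches: a two-scan load (one scan to generate the dictionary, a second to encode) and a single-pass load that assigns dictionary keys as values are first seen. It is plain Python for illustration only. The function names, the 1-based keys, and the first-seen key order are assumptions, and it does not use CarbonData's API or TableOutputFormat.

```python
from typing import Dict, Iterable, List, Tuple


def build_dictionary(values: Iterable[str]) -> Dict[str, int]:
    """First scan: assign an integer key to each distinct value."""
    dictionary: Dict[str, int] = {}
    for value in values:
        if value not in dictionary:
            # Keys start at 1 and follow first-seen order (illustrative choice).
            dictionary[value] = len(dictionary) + 1
    return dictionary


def two_pass_load(rows: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Two scans of the input: build the dictionary, then encode."""
    dictionary = build_dictionary(rows)             # scan 1
    encoded = [dictionary[value] for value in rows]  # scan 2
    return dictionary, encoded


def single_pass_load(rows: Iterable[str]) -> Tuple[Dict[str, int], List[int]]:
    """One scan of the input: extend the dictionary while encoding."""
    dictionary: Dict[str, int] = {}
    encoded: List[int] = []
    for value in rows:
        # setdefault inserts a new key only if the value is unseen.
        key = dictionary.setdefault(value, len(dictionary) + 1)
        encoded.append(key)
    return dictionary, encoded


if __name__ == "__main__":
    sample = ["cn", "us", "cn", "in"]
    print(two_pass_load(sample))
    print(single_pass_load(sample))
```

In this toy version both functions give the same result. The sketch ignores what makes single-pass harder in a real distributed load, namely keeping the dictionary consistent across parallel tasks.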
--
View this message in context:
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-CarbonData-loading-solution-discussion-tp4490p4491.html
Sent from the Apache CarbonD
Hi community,
Because CarbonData has a global dictionary feature, loading data into CarbonData
currently requires two scans of the input data. The first scan generates the
dictionary; the second does the actual data encoding and writes the carbon
files. Obviously, this approach is simple,