Re: XOR encoding for floating point

2017-07-10 Thread Geetika Gupta
Hi Jacky Li,

XOR encoding mainly works on time-series data, as discussed in the paper. We
looked into the classes you suggested and found that min and max values
will be available for our data. However, we first need to identify whether
the data is a time series, since XOR encoding can only be effective in that
case.

So do we need to check for time-series data before performing the encoding?
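
One possible way to make that check, sketched here in Python rather than against the CarbonData classes: sample a page of values and estimate how many bits an XOR scheme would need per value. The cost model below is a simplification loosely based on the scheme in the referenced paper, and the function names and the 0.75 threshold are assumptions for illustration only, not existing CarbonData APIs or settings.

```python
import struct


def _bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def estimated_xor_bits(values):
    """Estimate average encoded bits per value (simplified cost model, an assumption)."""
    if not values:
        return 0.0
    total = 64  # the first value is stored uncompressed
    prev = _bits(values[0])
    for v in values[1:]:
        cur = _bits(v)
        x = prev ^ cur
        if x == 0:
            total += 1  # identical to the previous value: one control bit
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1
            # 2 control bits + 5 bits leading-zero count + 6 bits length + payload
            total += 2 + 5 + 6 + (64 - leading - trailing)
        prev = cur
    return total / len(values)


def should_use_xor(values, max_ratio=0.75):
    """Hypothetical selection rule: pick XOR only if it beats 64 bits by a margin."""
    return estimated_xor_bits(values) < 64 * max_ratio
```

With a rule like this, the encoding is chosen from the observed bit patterns instead of from a separate "is this a time series" flag, so slowly changing measures qualify regardless of how the column is declared.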

-- 
Regards,
Geetika Gupta

On Thu, Jul 6, 2017 at 6:15 AM, Jacky Li wrote:

> +1
>
> Feel free to contribute :)
> To implement this feature, I think you need to break it into the
> following sub-tasks.
> 1. Extend ColumnPageCodec to implement XOR encoding.
> 2. Come up with the criteria for selecting this encoding and change the
> behavior of DefaultEncodingStrategy accordingly.
> 3. Add SQL syntax for this encoding.
>
> The encoding override work is still in progress. The SQL syntax part is
> missing, so point 3 can be done later.
>
>
> Regards,
> Jacky
>
> > On Jul 5, 2017, at 3:32 PM, Geetika Gupta wrote:
> >
> > Hi Community,
> >
> > I was looking into CARBONDATA-1128. The document attached to the Jira
> > describes compression of timestamp and decimal values. The decimal
> > values are compressed using XOR, so I would like to contribute to one
> > of its sub-tasks, i.e. CARBONDATA-1130.
> >
> > --
> > Regards,
> > Geetika Gupta
>
>
>
>


-- 
Regards,
Geetika Gupta


Re: XOR encoding for floating point

2017-07-05 Thread Jacky Li
+1

Feel free to contribute :)
To implement this feature, I think you need to break it into the
following sub-tasks.
1. Extend ColumnPageCodec to implement XOR encoding.
2. Come up with the criteria for selecting this encoding and change the
behavior of DefaultEncodingStrategy accordingly.
3. Add SQL syntax for this encoding.

The encoding override work is still in progress. The SQL syntax part is
missing, so point 3 can be done later.


Regards,
Jacky

> On Jul 5, 2017, at 3:32 PM, Geetika Gupta wrote:
> 
> Hi Community,
> 
> I was looking into CARBONDATA-1128. The document attached to the Jira
> describes compression of timestamp and decimal values. The decimal
> values are compressed using XOR, so I would like to contribute to one
> of its sub-tasks, i.e. CARBONDATA-1130.
> 
> -- 
> Regards,
> Geetika Gupta





Re: XOR encoding for floating point

2017-07-05 Thread Liang Chen
Hi Geetika

Very happy to see that you are interested in contributing this feature.
Please have the design discussion before you start to code.

Regards
Liang


Geetika Gupta wrote
> Hi Community,
> 
> I was looking into CARBONDATA-1128
> <https://issues.apache.org/jira/browse/CARBONDATA-1128>. The document
> attached to the Jira describes compression of timestamp and decimal
> values. The decimal values are compressed using XOR, so I would like to
> contribute to one of its sub-tasks, i.e. CARBONDATA-1130
> <https://issues.apache.org/jira/browse/CARBONDATA-1130>.
> 
> -- 
> Regards,
> Geetika Gupta





--
View this message in context: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/XOR-encoding-for-floating-point-tp17347p17394.html


[jira] [Created] (CARBONDATA-1130) XOR encoding for floating point

2017-06-05 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-1130:


             Summary: XOR encoding for floating point
                 Key: CARBONDATA-1130
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1130
             Project: CarbonData
          Issue Type: Sub-task
            Reporter: Jacky Li


In time-series data, the measure values in successive records may be almost
the same. XOR encoding can be used to compress this type of data.
See http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
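
To illustrate why nearly equal values compress well, the sketch below XORs the IEEE 754 bit patterns of two doubles and counts the leading and trailing zero bits of the result. Only the bits between those zero runs would need to be stored. The helper names are made up for this example and are not part of CarbonData.

```python
import struct


def double_to_bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stats(prev: float, cur: float):
    """Return (leading_zeros, trailing_zeros, meaningful_bits) of prev XOR cur."""
    x = double_to_bits(prev) ^ double_to_bits(cur)
    if x == 0:
        return 64, 64, 0  # identical values: no payload bits at all
    leading = 64 - x.bit_length()
    trailing = (x & -x).bit_length() - 1  # index of the lowest set bit
    return leading, trailing, 64 - leading - trailing


print(xor_stats(12.0, 12.0))  # (64, 64, 0)
print(xor_stats(12.0, 24.0))  # (11, 52, 1)
```

In the second call, 12.0 and 24.0 differ only in one exponent bit, so a single meaningful bit plus its position describes the change.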



