[jira] [Created] (CARBONDATA-955) CacheProvider test fails

2017-04-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-955: -- Summary: CacheProvider test fails Key: CARBONDATA-955 URL: https://issues.apache.org/jira/browse/CARBONDATA-955 Project: CarbonData Issue Type

[jira] [Created] (CARBONDATA-953) Add validations to Unsafe dataload. And control the data added to threads

2017-04-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-953: -- Summary: Add validations to Unsafe dataload. And control the data added to threads Key: CARBONDATA-953 URL: https://issues.apache.org/jira/browse/CARBONDATA-953

[jira] [Created] (CARBONDATA-915) Call getAll dictionary from codegen of dictionary decoder to improve dictionary load performance

2017-04-12 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-915: -- Summary: Call getAll dictionary from codegen of dictionary decoder to improve dictionary load performance Key: CARBONDATA-915 URL: https://issues.apache.org/jira

[jira] [Created] (CARBONDATA-893) MR testcase hangs in Hadoop 2.7.2 version profile

2017-04-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-893: -- Summary: MR testcase hangs in Hadoop 2.7.2 version profile Key: CARBONDATA-893 URL: https://issues.apache.org/jira/browse/CARBONDATA-893 Project

[jira] [Created] (CARBONDATA-874) select * from table order by limit query is failing

2017-04-05 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-874: -- Summary: select * from table order by limit query is failing Key: CARBONDATA-874 URL: https://issues.apache.org/jira/browse/CARBONDATA-874 Project

[VOTE] Apache CarbonData 1.1.0-incubating (RC1) release

2017-04-05 Thread Ravindra Pesala
Hi PPMC, I submit the Apache CarbonData 1.1.0-incubating (RC1) release for your vote. Release Notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220=12338987 Key features of this release are

[jira] [Created] (CARBONDATA-861) Improvements in query processing.

2017-04-05 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-861: -- Summary: Improvements in query processing. Key: CARBONDATA-861 URL: https://issues.apache.org/jira/browse/CARBONDATA-861 Project: CarbonData

Re: Re: Re: Optimize Order By + Limit Query

2017-03-29 Thread Ravindra Pesala
uce required blocklets. > > if you only apply spark's top N, I don't think you can make such below > performance. > > That's impossible if you don't reduce disk IO. > > > > > > > > > At 2017-03-30 03:12:54, "Ravindra Pesala" <ravi.pes...@gmai

Re: Load data into carbondata executors distributed unevenly

2017-03-29 Thread Ravindra Pesala
Hi, It seems the attachments are missing. Can you attach them again? Regards, Ravindra. On 30 March 2017 at 08:02, a wrote: > Hello! > > *Test result:* > When I load csv data into carbondata table 3 times, the executors > distributed unevenly. My purpose >

Re: Re: Optimize Order By + Limit Query

2017-03-29 Thread Ravindra Pesala
spark to carbon > like aggregation, limit, topn etc. But later it was removed because it is > very hard to maintain from version to version. I feel it is better that an > execution engine like spark does these types of operations. >

Re: Optimize Order By + Limit Query

2017-03-28 Thread Ravindra Pesala
Hi Jarck Ma, It is great to try optimizing Carbondata. I think this solution comes with many limitations. What if the order by column is not the first column? It needs to scan all blocklets to get the data out if the order by column is not the first column of the MDK. We used to have multiple

Re: Re:Re:Re:Re: insert into carbon table failed

2017-03-26 Thread Ravindra Pesala
ction.Iterator$$anon$11.hasNext(Iterator. > scala:327) > >>at org.apache.carbondata.spark.rdd.NewRddIterator.hasNext( > NewCarbonDataLoadRDD.scala:412) > >>at org.apache.carbondata.processing.newflow.steps. > InputProcessorStepImpl$InputProcessorIterator.inte

Re: Re:Re:Re: insert into carbon table failed

2017-03-26 Thread Ravindra Pesala
FSInputStream.java:934) > >>at java.io.DataInputStream.readFully(DataInputStream.java:195) > >> at > >> org.apache.hadoop.hive.ql.io.orc.MetadataReader.readStripeFooter(MetadataReader.java:112) > >>at > >> org.apache.hadoop.hive.ql.

[jira] [Created] (CARBONDATA-822) Add unsafe sort for bucketing feature

2017-03-26 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-822: -- Summary: Add unsafe sort for bucketing feature Key: CARBONDATA-822 URL: https://issues.apache.org/jira/browse/CARBONDATA-822 Project: CarbonData

[jira] [Created] (CARBONDATA-821) Remove Kettle related code and flow from carbon.

2017-03-26 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-821: -- Summary: Remove Kettle related code and flow from carbon. Key: CARBONDATA-821 URL: https://issues.apache.org/jira/browse/CARBONDATA-821 Project

[DISCUSSION] Initiating Apache CarbonData-1.1.0 incubating Release

2017-03-25 Thread Ravindra Pesala
Hi All, As planned we are going to release Apache CarbonData-1.1.0. Please discuss and vote for it to initiate the 1.1.0 release; I will start to prepare the release after 3 days of discussion. It will have the following features. 1. Introduced new data format called V3 (version 3). Improves the

Re: insert into carbon table failed

2017-03-25 Thread Ravindra Pesala
Hi, Carbondata launches one job per node to sort the data at node level and avoid shuffling. Internally it uses threads to load in parallel. Please use the carbon.number.of.cores.while.loading property in the carbon.properties file and set the number of cores it should use per machine while loading.
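A minimal carbon.properties sketch for the setting mentioned above; the value is only an illustrative assumption, not a recommendation from this thread:

    # assumed example: limit load parallelism per machine (value is illustrative)
    carbon.number.of.cores.while.loading=4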

[jira] [Created] (CARBONDATA-809) Union with alias is returning wrong result.

2017-03-23 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-809: -- Summary: Union with alias is returning wrong result. Key: CARBONDATA-809 URL: https://issues.apache.org/jira/browse/CARBONDATA-809 Project: CarbonData

Re: [PROPOSAL] Update on the Jenkins CarbonData job

2017-03-19 Thread Ravindra Pesala
+1 Regards, Ravindra. On 19 March 2017 at 11:14, Liang Chen wrote: > +1 > Thanks, JB. > > Regards > Liang > > 2017-03-17 22:48 GMT+08:00 Jean-Baptiste Onofré : > > > Hi guys, > > > > Tomorrow I plan to update our jobs on Apache Jenkins as the

[jira] [Created] (CARBONDATA-793) Count with null values is giving wrong result.

2017-03-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-793: -- Summary: Count with null values is giving wrong result. Key: CARBONDATA-793 URL: https://issues.apache.org/jira/browse/CARBONDATA-793 Project: CarbonData

[jira] [Created] (CARBONDATA-791) Exists queries of TPC-DS are failing in carbon

2017-03-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-791: -- Summary: Exists queries of TPC-DS are failing in carbon Key: CARBONDATA-791 URL: https://issues.apache.org/jira/browse/CARBONDATA-791 Project: CarbonData

[jira] [Created] (CARBONDATA-786) Data mismatch if the data is loaded across blocklet groups

2017-03-16 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-786: -- Summary: Data mismatch if the data is loaded across blocklet groups Key: CARBONDATA-786 URL: https://issues.apache.org/jira/browse/CARBONDATA-786

Re: 【DISCUSS】add more index for sort columns

2017-03-14 Thread Ravindra Pesala
Hi Bill, Min/max for measure columns is already added in the V3 format. Measure column filters are being added now, so it does block and blocklet pruning based on min/max to reduce IO and processing. And as per your suggestions, the column needs to be sorted and multiple ranges maintained in metadata.

[jira] [Created] (CARBONDATA-771) Dataloading fails in V3 format for TPC-DS data.

2017-03-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-771: -- Summary: Dataloading fails in V3 format for TPC-DS data. Key: CARBONDATA-771 URL: https://issues.apache.org/jira/browse/CARBONDATA-771 Project: CarbonData

[jira] [Created] (CARBONDATA-769) Support Codegen in CarbonDictionaryDecoder

2017-03-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-769: -- Summary: Support Codegen in CarbonDictionaryDecoder Key: CARBONDATA-769 URL: https://issues.apache.org/jira/browse/CARBONDATA-769 Project: CarbonData

Re: column auto mapping when loading data from csv file

2017-03-12 Thread Ravindra Pesala
Hi Yinwei, Even I feel it is a little cumbersome to force the user to add the header to the CSV file or to the loading script. But what Manish said is also true. I think we should come up with a new option in the loading script to accept auto mapping of DDL columns and CSV columns. If the user knows that DDL

Re: Removing of kettle code from Carbondata

2017-03-12 Thread Ravindra Pesala
Hi David, Thank you for your suggestion. All known and major flows are tested, and it is already the default flow in the current version. Please let us know when you finish testing the new flow completely; after that we can initiate removing the kettle flow again. Regards, Ravindra. On 13 March 2017

Removing of kettle code from Carbondata

2017-03-10 Thread Ravindra Pesala
Hi All, I guess it is time to remove the kettle flow from Carbondata loading. Now there are two flows to load the data, and it becomes difficult to maintain the code. Bug fixing or any feature implementation needs to be done in both places, so it becomes difficult for a developer to implement and

Re: Question related to lazy decoding optimzation

2017-03-08 Thread Ravindra Pesala
Hi Yong Zhang, Thank you for analyzing carbondata. Yes, lazy decoding is only possible if the dictionaries are global. At the time of loading the data it generates global dictionary values. There are 2 ways to generate global dictionary values. 1. Launch a job to read all input data and find the

Re: question about dimColumnExecuterInfo.getFilterKeys()

2017-03-08 Thread Ravindra Pesala
Hi, The filter values which we get from the query will be converted to the respective surrogates and sorted on surrogate values before the filter is applied. Regards, Ravindra On 8 March 2017 at 09:55, 马云 wrote: > Hi Dev, > > > when doing a filter query, I can see a filtered

[jira] [Created] (CARBONDATA-743) Remove the abundant class CarbonFilters.scala

2017-03-02 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-743: -- Summary: Remove the abundant class CarbonFilters.scala Key: CARBONDATA-743 URL: https://issues.apache.org/jira/browse/CARBONDATA-743 Project: CarbonData

[jira] [Created] (CARBONDATA-739) Avoid creating multiple instances of DirectDictionary in DictionaryBasedResultCollector

2017-03-02 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-739: -- Summary: Avoid creating multiple instances of DirectDictionary in DictionaryBasedResultCollector Key: CARBONDATA-739 URL: https://issues.apache.org/jira/browse

Re: Improving Non-dictionary storage & performance.

2017-03-02 Thread Ravindra Pesala
> > What is your idea? > > Regards, > Jacky > > > On 1 March 2017 at 11:31 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote: > > > > Hi Vishal, > > > > You are right, that's why we can do no-dictionary only for the String > datatype. > > Please loo

Re: [DISCUSS] Graduation to a TLP (Top Level Project)

2017-03-01 Thread Ravindra Pesala
+1 It is exciting to see the Carbondata project going for graduation to TLP. Our hard work is going to pay off soon. Thanks JB for taking this initiative. Regards, Ravindra. On 1 March 2017 at 15:50, Jean-Baptiste Onofré wrote: > Hi Liang, > > We are now good. I will update pull

Re: Improving Non-dictionary storage & performance.

2017-03-01 Thread Ravindra Pesala
shal <kumarvishal1...@gmail.com> wrote: > Hi Ravi, > Sorting of data for no dictionary should be based on data type + same for > filter . Please add this point. > > -Regards > Kumar Vishal > > On Wed, Mar 1, 2017 at 8:34 PM, Ravindra Pesala <ravi.pes...@gmail.com> >

Re: [DISCUSS] For the dimension default should be no dictionary

2017-03-01 Thread Ravindra Pesala
Hi All, In order to make no-dictionary columns the default, we should improve the storage and performance of these columns. I have sent another mail to discuss the improvement points. Please comment on it. Regards, Ravindra On 1 March 2017 at 10:12, Ravindra Pesala <ravi.pes...@gmail.com>

Improving Non-dictionary storage & performance.

2017-03-01 Thread Ravindra Pesala
Hi, In order to make non-dictionary column storage and performance more efficient, I am suggesting the following improvements. 1. Always make SHORT, INT, BIGINT, DOUBLE & FLOAT direct dictionary. Right now only date and timestamp are direct dictionary columns. We can make SHORT, INT,

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
dentify the high card columns. I feel > preventing such misuse is important in order to encourage more users to > use carbondata. > > Any suggestion on solving this issue? > > > Regards, > Likun > > > > On 28 February 2017 at 10:20 PM, Ravindra Pesala <ravi.pes...@gmail.c

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
Hi Likun, You mentioned that if the user does not specify dictionary columns then by default those are chosen as no-dictionary columns. But there are many disadvantages, as I mentioned in the above mail, if we keep no dictionary as the default. We initially introduced no-dictionary columns to handle high

Re: Block B-tree loading failed

2017-02-28 Thread Ravindra Pesala
Hi, Have you loaded the data freshly and tried to execute the query? Or are you trying to query an old store you have already loaded? Regards, Ravindra. On 28 February 2017 at 17:20, ericzgy <1987zhangguang...@163.com> wrote: > Now when I load data into a CarbonData table using spark1.6.2 and >

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-27 Thread Ravindra Pesala
e of main advantage, no > > dictionary column aggregation will be slower. Filter query will suffer as > > in case of dictionary column we are comparing on byte pack value, in case > > of no dictionary it will be on actual value. > > > > -Regards > > Kumar V

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-26 Thread Ravindra Pesala
Hi, I feel there are more disadvantages than advantages in this approach. In your current scenario you want to set dictionary only for columns which are used as filters, but the usage of dictionary is not limited to filters; it can also reduce the store size and improve aggregation queries.

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-21 Thread Ravindra Pesala
Hi, Please create the carbon context as follows. val cc = new CarbonContext(sc, storeLocation) Here storeLocation is hdfs://hacluster/tmp/carbondata/carbon.store in your case. Regards, Ravindra On 21 February 2017 at 08:30, Ravindra Pesala <ravi.pes...@gmail.com> wrote: > Hi, &g
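A minimal Scala sketch of the CarbonContext creation described above, assuming a Spark 1.x SparkContext named sc is already available; the store path is the one given in this thread:

    import org.apache.spark.sql.CarbonContext

    // HDFS store location from the thread above
    val storeLocation = "hdfs://hacluster/tmp/carbondata/carbon.store"
    // build the carbon context on top of the existing SparkContext
    val cc = new CarbonContext(sc, storeLocation)
    // queries then go through the carbon context, e.g. cc.sql("show tables")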

Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Ravindra Pesala
Congratulations Hexiaoqiao. Regards, Ravindra. On 21 February 2017 at 10:15, Xiaoqiao He wrote: > Hi PPMC, Liang, > > It is my honor that receive the invitation, and very happy to have chance > that participate to build CarbonData community also. I will keep > contributing

[jira] [Created] (CARBONDATA-715) Optimize Single pass data load

2017-02-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-715: -- Summary: Optimize Single pass data load Key: CARBONDATA-715 URL: https://issues.apache.org/jira/browse/CARBONDATA-715 Project: CarbonData Issue

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-20 Thread Ravindra Pesala
pache.spark.sql.execution.command.LoadTable. > run(carbonTableSchema.scala:360) > > at > > org.apache.spark.sql.execution.ExecutedCommand. > sideEffectResult$lzycompute(commands.scala:58) > > at > > org.apache.spark.sql.execution.ExecutedCommand. > sideEffectResult(commands.scala:56) > > a

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-17 Thread Ravindra Pesala
Hi Xiaoqiao, Does the problem still exist? Can you try a clean build with the "mvn clean -DskipTests -Pspark-1.6 package" command? Regards, Ravindra. On 16 February 2017 at 08:36, Xiaoqiao He wrote: > hi Liang Chen, > > Thanks for your help. It is true that i install and

Re: question about the order between original values and its encoded values

2017-02-16 Thread Ravindra Pesala
Hi, Yes, it works because we sort the column values before assigning dictionary values to them. So it works only if you have loaded the data once (that is, there is no incremental load). If you do an incremental load and more dictionary values are added to the store, then there is no
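A small illustration of why incremental loads break this ordering (values and surrogate keys below are hypothetical):

    First load (sorted values):  apple -> 1, banana -> 2, cherry -> 3   // dictionary order matches value order
    Incremental load adds:       aardvark -> 4                          // key 4 > 3, but "aardvark" < "apple"

After the second load the surrogate-key order no longer matches the natural value order, so ordering by the encoded values is no longer valid.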

Re: whether carbondata can be used in hive on spark?

2017-02-16 Thread Ravindra Pesala
Hi, We have so far integrated only with Spark, not yet with Hive. So carbondata cannot be used with Hive on Spark at this moment. Regards, Ravindra. On 16 February 2017 at 14:35, wangzheng <18031...@qq.com> wrote: > we use cdh5.7, it removes the thriftserver of spark, so sparksql is

Re: 回复: data lost when loading data from csv file to carbon table

2017-02-16 Thread Ravindra Pesala
Hi QiangCai, The PR594 fix does not solve the data loss issue; it fixes the data mismatch in some cases. Regards, Ravindra. On 16 February 2017 at 09:35, QiangCai wrote: > Maybe you can check PR594, it will fix a bug which will impact the result > of > loading. > > > > -- > View

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
sk, ws_web_page_sk, ws_web_site_sk, ws_ship_mode_sk, > ws_warehouse_sk, ws_promo_sk, ws_order_number')"); > > > > and here is my script for generate tpc-ds data: > [hadoop@master tools]$ ./dsdgen -scale 1 -suffix '.csv' -dir > /data/tpc-ds/data/ > > > > > > >

Re: 回复: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
Hi Yinwei, Can you provide the create table scripts for both the store_returns and web_sales tables? Regards, Ravindra. On 16 February 2017 at 10:07, Ravindra Pesala <ravi.pes...@gmail.com> wrote: > Hi Yinwei, > > Thank you for pointing out the issue, I will check with TPC-DS da

Re: 回复: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
Hi Yinwei, Thank you for pointing out the issue. I will check with TPC-DS data and verify the data load with the new flow. Regards, Ravindra. On 16 February 2017 at 09:35, QiangCai wrote: > Maybe you can check PR594, it will fix a bug which will impact the result > of > loading.

Re: Introducing V3 format.

2017-02-15 Thread Ravindra Pesala
k pruning and less number of false positive blocks will improve the > >>filter query performance. Separating uncompression of data from reader > >>layer will improve the overall query performance. > >> > >>-Regards > >>Kumar Vishal > >> &g

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
t > > > the configuration in my carbon.properties is: > carbon.kettle.home=/opt/spark-2.1.0/carbonlib/carbonplugins, but it seems > not work. > > > how can I solve this problem. > > > -- > > > Hi Liang Chen, > > > would you add a more d

Re: Introducing V3 format.

2017-02-15 Thread Ravindra Pesala
Please find the thrift file in below location. https://drive.google.com/open?id=0B4TWTVbFSTnqZEdDRHRncVItQ242b1NqSTU2b2g4dkhkVDRj On 15 February 2017 at 17:14, Ravindra Pesala <ravi.pes...@gmail.com> wrote: > Problems in current format. > 1. IO read is slower since it needs to go

Re: data lost when loading data from csv file to carbon table

2017-02-14 Thread Ravindra Pesala
Hi, Please set carbon.badRecords.location in carbon.properties and check whether any bad records are added to that location. Regards, Ravindra. On 14 February 2017 at 15:24, Yinwei Li <251469...@qq.com> wrote: > Hi all, > > > I met a data loss problem when loading data from a csv file to a carbon >

[jira] [Created] (CARBONDATA-702) Create carbondata repository to keep format jar

2017-02-11 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-702: -- Summary: Create carbondata repository to keep format jar Key: CARBONDATA-702 URL: https://issues.apache.org/jira/browse/CARBONDATA-702 Project: CarbonData

Re: Discussion about getting excution duration about a query when using sparkshell+carbondata

2017-02-08 Thread Ravindra Pesala
Hi Libis, The spark-sql CLI is not supported by carbondata. Why don't you use the carbon thrift server and beeline? It is the same as the spark-sql CLI and it gives the execution time for each query. Script to start the carbondata thrift server: bin/spark-submit --class

Re: query exception: Path is not a file when carbon 1.0.0

2017-02-08 Thread Ravindra Pesala
Hi, This exception is actually ignored in the SegmentUpdateStatusManager class at line number 696, and it does not create any problem. Usually this exception won't be printed in any server logs as we are ignoring it; maybe spark-shell is printing it. We will look into it. Regards, Ravindra.

Re: Aggregate performace

2017-02-08 Thread Ravindra Pesala
Hi, The performance depends on the query plan. When you submit a query like [Select attributeA, count(*) from tableB group by attributeA], Spark asks carbon to give only the attributeA column. So Carbon reads only the attributeA column from all files and sends the result to Spark to
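A hedged Scala sketch of issuing such a group-by through a CarbonContext (assuming a context named cc as created elsewhere in these threads; table and column names are placeholders taken from the example above):

    // only attributeA is requested, so carbon needs to scan just that column
    val grouped = cc.sql("select attributeA, count(*) from tableB group by attributeA")
    grouped.show()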

[jira] [Created] (CARBONDATA-692) Support scalar subquery in carbon

2017-02-01 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-692: -- Summary: Support scalar subquery in carbon Key: CARBONDATA-692 URL: https://issues.apache.org/jira/browse/CARBONDATA-692 Project: CarbonData

[jira] [Created] (CARBONDATA-680) Add stats like rows processed in each step. And also fix unsafe sort enable issue.

2017-01-25 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-680: -- Summary: Add stats like rows processed in each step. And also fix unsafe sort enable issue. Key: CARBONDATA-680 URL: https://issues.apache.org/jira/browse/CARBONDATA

Re: [VOTE] Apache CarbonData 1.0.0-incubating release (RC2)

2017-01-20 Thread Ravindra Pesala
+1 Did a sanity check of all major features; it is fine. Regards, Ravindra. On Sat, Jan 21, 2017, 07:51 Liang Chen wrote: > +1(binding) > > I checked: > - name contains incubating > - disclaimer exists > - signatures and hash correct > - NOTICE good > - LICENSE is good > -

Re: Re: Failed to APPEND_FILE, hadoop.hdfs.protocol.AlreadyBeingCreatedException

2017-01-20 Thread Ravindra Pesala
Hi, Please use "mvn clean -DskipTests -Pspark-1.5 -Dspark.version=1.5.2 -Phadoop-2.7.2 package" Regards, Ravindra On 20 January 2017 at 15:42, manish gupta wrote: > Can you try compiling with hadoop-2.7.2 version and use it and let us know > if the issue still

[jira] [Created] (CARBONDATA-656) Simplify the carbon session creation

2017-01-17 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-656: -- Summary: Simplify the carbon session creation Key: CARBONDATA-656 URL: https://issues.apache.org/jira/browse/CARBONDATA-656 Project: CarbonData

[jira] [Created] (CARBONDATA-655) Make nokettle dataload flow as default in carbon

2017-01-17 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-655: -- Summary: Make nokettle dataload flow as default in carbon Key: CARBONDATA-655 URL: https://issues.apache.org/jira/browse/CARBONDATA-655 Project

Re: Unable to Assign Jira to me

2017-01-13 Thread Ravindra Pesala
Please provide your Jira user name and mail id. We will add you as a contributor so that you can assign issues to yourself. On Fri, Jan 13, 2017, 16:49 Anurag Srivastava wrote: > Hello Team, > > I am working on JIRA [CARBONDATA-542] and want to assign this JIRA to me. > But I am

[jira] [Created] (CARBONDATA-628) Issue when measure selection without table order gives wrong result with vectorized reader enabled

2017-01-11 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-628: -- Summary: Issue when measure selection without table order gives wrong result with vectorized reader enabled Key: CARBONDATA-628 URL: https://issues.apache.org/jira

Re: TestCase failed

2017-01-10 Thread Ravindra Pesala
Hi, Please make sure the store path of "flightdb2" is given properly inside the CarbonInputMapperTest class. Please provide the complete stack trace of the error. On 10 January 2017 at 17:54, 彭 wrote: > Hi, all: > Recently, I met a failing TestCase. Does anyone know about it? >

[jira] [Created] (CARBONDATA-618) Add new profile to build all modules for release purpose

2017-01-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-618: -- Summary: Add new profile to build all modules for release purpose Key: CARBONDATA-618 URL: https://issues.apache.org/jira/browse/CARBONDATA-618 Project

[jira] [Created] (CARBONDATA-611) mvn clean -Pbuild-with-format package does not work

2017-01-09 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-611: -- Summary: mvn clean -Pbuild-with-format package does not work Key: CARBONDATA-611 URL: https://issues.apache.org/jira/browse/CARBONDATA-611 Project

Re: Select query is not working.

2017-01-05 Thread Ravindra Pesala
Hi, It's an issue; we are working on the fix. On 5 January 2017 at 17:26, Anurag Srivastava wrote: > Hello, > > I have taken the latest code today (5/01/2017) and built the code with spark > 1.6. After that I put the latest jar in carbonlib in spark and started the thrift > server. > >

Re: carbon shell is not working with spark 2.0 version

2017-01-03 Thread Ravindra Pesala
Yes, it is not working because the support is not yet added. Right now it is a low-priority task, as the user can directly use spark-shell to create a CarbonSession and execute queries. On 4 January 2017 at 12:40, anubhavtarar wrote: > carbon shell is not working with spark

[jira] [Created] (CARBONDATA-580) Support Spark 2.1 in Carbon

2016-12-30 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-580: -- Summary: Support Spark 2.1 in Carbon Key: CARBONDATA-580 URL: https://issues.apache.org/jira/browse/CARBONDATA-580 Project: CarbonData Issue

[jira] [Created] (CARBONDATA-577) Carbon session is not working in spark shell.

2016-12-28 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-577: -- Summary: Carbon session is not working in spark shell. Key: CARBONDATA-577 URL: https://issues.apache.org/jira/browse/CARBONDATA-577 Project: CarbonData

[jira] [Created] (CARBONDATA-574) Add thrift server support to Spark 2.0 carbon integration

2016-12-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-574: -- Summary: Add thrift server support to Spark 2.0 carbon integration Key: CARBONDATA-574 URL: https://issues.apache.org/jira/browse/CARBONDATA-574 Project

Re: CatalystAnalysy

2016-12-27 Thread Ravindra Pesala
Have you used 'mvn clean'? On 28 December 2016 at 07:18, rahulforallp wrote: > hey QiangCai, > thank you for your reply . i have spark 1.6.2. and also tried with > -Dspark.version=1.6.2 . But result is same . Still i am getting same > exception. > > Is this exception

Re: Float Data Type Support in carbondata Querry

2016-12-27 Thread Ravindra Pesala
Hi, Carbon is supposed to return float data when you use the float data type. Please check whether you are converting the data to float or not in the ScannedResultCollector implementation classes. Regards, Ravindra On 27 December 2016 at 20:23, Rahul Kumar wrote: > Hello

Re: Dictionary file is locked for updation

2016-12-27 Thread Ravindra Pesala
Hi, It seems the store path is taking the default location. Did you set the store location properly? Which Spark version are you using? Regards, Ravindra On Tue, Dec 27, 2016, 1:38 PM 251469031 <251469...@qq.com> wrote: > Hi Kumar, > > > thanks for your reply, the full logs are as follows:

[jira] [Created] (CARBONDATA-547) Add CarbonSession and enabled parser to use all carbon commands

2016-12-21 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-547: -- Summary: Add CarbonSession and enabled parser to use all carbon commands Key: CARBONDATA-547 URL: https://issues.apache.org/jira/browse/CARBONDATA-547

Re: [DISCUSSION] CarbonData loading solution discussion

2016-12-15 Thread Ravindra Pesala
+1 to having separate output formats; now the user has the flexibility to choose as per the scenario. On Fri, Dec 16, 2016, 2:47 AM Jihong Ma wrote: > > It is a great idea to have a separate OutputFormat for regular Carbon data > files, index files as well as metadata files. For

[jira] [Created] (CARBONDATA-519) Enable vector reader in Carbon-Spark 2.0 integration and Carbon layer

2016-12-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-519: -- Summary: Enable vector reader in Carbon-Spark 2.0 integration and Carbon layer Key: CARBONDATA-519 URL: https://issues.apache.org/jira/browse/CARBONDATA-519

Re: select return error when filter string column in where clause

2016-12-05 Thread Ravindra Pesala
Hi, Please provide the table schema, load command, and sample data to reproduce this issue; you may create a JIRA for it. Regards, Ravi On 6 December 2016 at 07:05, Lu Cao wrote: > Hi Dev team, > I have loaded some data into a carbondata table. But when I put the id >

Re: About hive integration

2016-12-04 Thread Ravindra Pesala
Hi, Yes, we have plans to integrate carbondata with the Hive engine, but it is not our high-priority work now, so we will take up this task gradually. Any contributions towards it are welcome. Regards, Ravi On 4 December 2016 at 12:30, Sea <261810...@qq.com> wrote: > Hi, all: > Now

Re: Why INT type is stored like BIGINT?

2016-12-04 Thread Ravindra Pesala
Hi, Since we use delta compression for measure types in carbondata, it stores the data with the smallest datatype that fits the values in the blocklet. So it does not matter whether we store INT or BIGINT in carbondata files; it always uses the smallest datatype for storage. Regards, Ravi On 4 December 2016 at 13:28,
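A conceptual Scala sketch of value-minus-min delta encoding (an illustration of the general idea, not carbondata's actual code), showing how BIGINT values whose range within a blocklet is small can each be stored in a single byte:

    // hypothetical blocklet of BIGINT (Long) values
    val values = Array(1000000010L, 1000000042L, 1000000099L)
    val minValue = values.min
    // deltas fit in a Byte even though the declared column type is BIGINT
    val deltas: Array[Byte] = values.map(v => (v - minValue).toByte)
    // decoding adds the minimum back to restore the original values
    val restored: Array[Long] = deltas.map(d => d.toLong + minValue)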

Re: Question about RLE support in CarbonData

2016-11-30 Thread Ravindra Pesala
Hi, Here some encodings can be done at the field level and some at the blocklet (batch of column data) level. DICTIONARY encoding is done at the field level, and this FieldConverter only encodes data at the field level. RLE is applied at the blocklet level, so it is applied while writing

[jira] [Created] (CARBONDATA-469) Optimize join in spark using bucketing information

2016-11-29 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-469: -- Summary: Optimize join in spark using bucketing information Key: CARBONDATA-469 URL: https://issues.apache.org/jira/browse/CARBONDATA-469 Project

[jira] [Created] (CARBONDATA-468) Add pruning in driver side to improve query performance

2016-11-29 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-468: -- Summary: Add pruning in driver side to improve query performance Key: CARBONDATA-468 URL: https://issues.apache.org/jira/browse/CARBONDATA-468 Project

[jira] [Created] (CARBONDATA-467) Add bucketing information while creating table and update in thrift format.

2016-11-29 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-467: -- Summary: Add bucketing information while creating table and update in thrift format. Key: CARBONDATA-467 URL: https://issues.apache.org/jira/browse/CARBONDATA-467

[jira] [Created] (CARBONDATA-466) Implement bucketing table in carbondata

2016-11-29 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-466: -- Summary: Implement bucketing table in carbondata Key: CARBONDATA-466 URL: https://issues.apache.org/jira/browse/CARBONDATA-466 Project: CarbonData

[jira] [Created] (CARBONDATA-456) Select count(*) from table is slower.

2016-11-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-456: -- Summary: Select count(*) from table is slower. Key: CARBONDATA-456 URL: https://issues.apache.org/jira/browse/CARBONDATA-456 Project: CarbonData

Re: [New Feature] Adding bucketed table feature to Carbondata

2016-11-27 Thread Ravindra Pesala
uti...@gmail.com> wrote: > How is this different from partitioning? > On Sun, 27 Nov 2016 at 11:21 PM, Ravindra Pesala <ravi.pes...@gmail.com> > wrote: > > > Hi All, > > > > Bucketing concept is based on the hash partition the bucketed column as > per >

[improvement] Support unsafe in-memory sort in carbondata

2016-11-27 Thread Ravindra Pesala
Hi All, In the current carbondata system, loading performance is not so encouraging since we need to sort the data at the executor level for data loading. Carbondata collects a batch of data and sorts it before dumping it to temporary files, and finally it does a merge sort over those temporary files to
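A conceptual Scala sketch of the sort-batch/spill/merge pattern described above (a simplified illustration, not carbondata's actual code; in-memory Seqs stand in for the spilled temporary files):

    import scala.collection.mutable

    // each inner Seq stands in for one temporary file: a batch sorted before being spilled
    val sortedRuns: Seq[Seq[Int]] = Seq(Seq(5, 1, 9), Seq(7, 3), Seq(8, 2, 6)).map(_.sorted)

    // final step: k-way merge of the sorted runs using a min-heap of (value, runIndex, position)
    val heap = mutable.PriorityQueue.empty[(Int, Int, Int)](Ordering.by((t: (Int, Int, Int)) => -t._1))
    sortedRuns.zipWithIndex.foreach { case (run, i) => if (run.nonEmpty) heap.enqueue((run.head, i, 0)) }

    val merged = mutable.ArrayBuffer.empty[Int]
    while (heap.nonEmpty) {
      val (value, runIdx, pos) = heap.dequeue()
      merged += value
      if (pos + 1 < sortedRuns(runIdx).length)
        heap.enqueue((sortedRuns(runIdx)(pos + 1), runIdx, pos + 1))
    }
    // merged now holds the globally sorted data: 1, 2, 3, 5, 6, 7, 8, 9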

[New Feature] Adding bucketed table feature to Carbondata

2016-11-27 Thread Ravindra Pesala
Hi All, The bucketing concept is based on hash partitioning the bucketed column as per the configured number of buckets. Records with the same bucketed column value always go to the same bucket. Physically, each bucket is a file or files in the table directory. Advantages: a bucketed table is a useful feature to do the
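A conceptual Scala sketch of the hash-based bucket assignment described above (illustrative only; function and variable names are hypothetical, not carbondata's implementation):

    // route each record to a bucket by hashing the bucketed column value
    val numBuckets = 4
    def bucketFor(bucketColumnValue: String): Int =
      Math.floorMod(bucketColumnValue.hashCode, numBuckets)

    // records with the same bucketed column value always land in the same bucket
    val sameBucket = bucketFor("cust_001") == bucketFor("cust_001")   // always true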

Re: Using DataFrame to write carbondata file cause no table found error

2016-11-25 Thread Ravindra Pesala
Hi, In Append mode, the carbon table is supposed to be created beforehand; otherwise the load fails as the table does not exist. In Overwrite mode the carbon table is created (it is dropped first if it already exists) and the data is loaded. But in your case, for overwrite mode it creates the table but it says the table is not
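A hedged Scala sketch of the two save modes being discussed, assuming the carbondata DataFrame source with a tableName option as shown in CarbonData examples of this period; df is a placeholder DataFrame:

    import org.apache.spark.sql.SaveMode

    // Overwrite: drops the table if it already exists, creates it, then loads the data
    df.write
      .format("carbondata")
      .option("tableName", "carbon_table")   // assumed option name
      .mode(SaveMode.Overwrite)
      .save()

    // Append: the carbon table must already exist, otherwise the load fails
    df.write
      .format("carbondata")
      .option("tableName", "carbon_table")
      .mode(SaveMode.Append)
      .save()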

Re: CarbonData propose major version number increment for next version (to 1.0.0)

2016-11-24 Thread Ravindra Pesala
+1 On Thu, Nov 24, 2016, 10:37 PM manish gupta wrote: > +1 > > Regards > Manish Gupta > > On Thu, Nov 24, 2016 at 7:30 PM, Kumar Vishal > wrote: > > > +1 > > > > -Regards > > Kumar Vishal > > > > On Thu, Nov 24, 2016 at 2:41 PM, Raghunandan

Re: Please vote and advise on building thrift files

2016-11-16 Thread Ravindra Pesala
+1 for proposal 1 On 17 November 2016 at 08:23, Xiaoqiao He wrote: > +1 for proposal 1. > > On Thu, Nov 17, 2016 at 10:31 AM, ZhuWilliam > wrote: > > > +1 for proposal 1 . > > > > Auto generated code should not be added to project. Also most the of

Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread Ravindra Pesala
+1 On Mon, Nov 14, 2016, 3:54 PM sujith chacko wrote: > Hi liang, > Yes, its for high cardinality columns. > Thanks, > Sujith > > On Nov 14, 2016 2:01 PM, "Liang Chen" wrote: > > > Hi > > > > I have one query : for no dictionary columns

Single Pass Data Load Design

2016-11-13 Thread Ravindra Pesala
Hi All, Please find the proposed solutions for single pass data load. https://docs.google.com/document/d/1_sSN9lccCZo4E_X3pNP5PchQACqif3AOXKTuG-YJAcc/edit?usp=sharing -- Thanks & Regards, Ravindra
