error occur when I load data to s3

2018-08-29 Thread aaron
/* spark.executor.extraClassPath file:///usr/local/Cellar/apache-spark/2.2.1/lib/* The lib folder includes the jars below: -rw-r--r--@ 1 aaron staff 52M Aug 29 20:50 apache-carbondata-1.4.1-bin-spark2.2.1-hadoop2.7.2.jar -rw-r--r-- 1 aaron staff 764K Aug 29 21:33 httpclient-4.5.4.jar -rw-r--r-- 1 aaron staff 314K Aug

Re: [DISCUSSION] Support Standard Spark's FileFormat interface in Carbondata

2018-08-31 Thread aaron
Does this mean that we could call carbon in pyspark? -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

master timeSeries DATAMAP does not work well as 1.4.1

2018-09-04 Thread aaron
e_code, country_code, category_id, product_id, sum(revenue), timeseries(date, 'year') from test_store_int group by timeseries(date, 'year'), market_code, device_code, country_code, category_id, product_id""".stripMargin).show(200, truncate=false) |== CarbonData Profiler

Re: [Serious Issue] Rows disappeared

2018-09-27 Thread aaron
a) First can you disable local dictionary and try the same scenario? I will test this another time. Good idea, and I think this works: when I use global dictionary, the query returns the right result. But the question is, global

Re: [Serious Issue] Rows disappeared

2018-09-27 Thread aaron
Another comment: this issue can be reproduced on spark2.3.1 + carbondata1.5.0 and on spark2.2.2 + carbondata1.5.0. I can send you the jar I compiled; I hope this helps.

Re: [Serious Issue] Rows disappeared

2018-09-27 Thread aaron
Yes, you're right.

Re: [Serious Issue] Rows disappeared

2018-09-27 Thread aaron
This is how I construct the carbon instance; I hope it helps. def carbonSession(appName: String, masterUrl: String, parallelism: String, logLevel: String, hdfsUrl: String="hdfs://ec2-dca-aa-p-sdn-16.appannie.org:9000"): SparkSession = { val storeLocation =

Re: [Serious Issue] Rows disappeared

2018-09-27 Thread aaron
@Ajantha, Great! Looking forward to your fix :)

Re: [ISSUE] carbondata1.5.0 and spark 2.3.2 query plan issue

2018-10-05 Thread aaron
Data should be right.

Re: [ISSUE] carbondata1.5.0 and spark 2.3.2 query plan issue

2018-09-30 Thread aaron
Hi xm_zzc, Thanks for your response. I tested on 2.3.2, not on 2.2.2. I have also merged the fix from the issue http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Issue-Dictionary-and-S3-td63106.html and

Re: [ISSUE] carbondata1.5.0 and spark 2.3.2 query plan issue

2018-10-01 Thread aaron
d AND r.device_code=u.device_code ) AS a )AS f ORDER BY f.arpu DESC LIMIT 10 Thanks Aaron

Re: Issues about dictionary and S3

2018-09-29 Thread aaron
Wow, cool! I will give it a try!

Re: [Serious Issue] Rows disappeared

2018-09-29 Thread aaron
Cool! It works now. Thanks a lot!

Re: [Minor Issue] BETWEEN AND does work as expected

2018-09-29 Thread aaron
Cool! Thanks a lot for your effort!

Re: [Serious Issue] Rows disappeared

2018-09-28 Thread aaron
Great, I will give it a try later.

[Issue] Long string columns config for big strings not work

2018-10-10 Thread aaron
Hi Community, I encountered an issue: the LONG_STRING_COLUMNS config for big strings does not work. My env is spark2.3.2 + carbon 1.5.0. 1. DDL SQL: carbon.sql( s""" |CREATE TABLE IF NOT EXISTS product( |market_code STRING, |product_id LONG, |country_code STRING,

Re: [Issue] Long string columns config for big strings not work

2018-10-10 Thread aaron
Hi Community, I found that it works if I match the DataFrame column order to the table column order as below. _df .select( "market_code", "product_id", "country_code", "category_id", "company_id", "name", "company", "release_date", "price", "version",
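
The reordering described in the post above can be sketched in plain Scala. This is a minimal illustration under stated assumptions, not code from the thread: `ColumnOrder` and `alignToTable` are hypothetical names, the table's column order is assumed to be available as a `Seq[String]`, and the name comparison is case-sensitive, which may not match the session's settings.

```scala
object ColumnOrder {
  /** Returns the table's column order when the DataFrame has exactly the same
    * set of column names; otherwise reports which names are missing or unexpected.
    * Hypothetical helper for illustration only. */
  def alignToTable(tableColumns: Seq[String],
                   dfColumns: Seq[String]): Either[String, Seq[String]] = {
    // Columns the table expects but the DataFrame does not provide.
    val missing = tableColumns.filterNot(dfColumns.contains)
    // Columns the DataFrame provides but the table does not declare.
    val unexpected = dfColumns.filterNot(tableColumns.contains)
    if (missing.isEmpty && unexpected.isEmpty) Right(tableColumns)
    else Left(s"missing: ${missing.mkString(", ")}; unexpected: ${unexpected.mkString(", ")}")
  }
}
```

With Spark on the classpath, a `Right(ordered)` result could then be applied as `df.select(ordered.map(org.apache.spark.sql.functions.col): _*)` before writing, so that the DataFrame columns arrive in the order the table declares them.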

Re: [Issue] Long string columns config for big strings not work

2018-10-11 Thread aaron
Thanks, I will give it a try.

Re: [ISSUE] carbondata1.5.0 and spark 2.3.2 query plan issue

2018-09-30 Thread aaron
Screen_Shot_2018-09-30_at_5.png

Re: error occur when I load data to s3

2018-09-02 Thread aaron
.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 18/09/02 21:49:47 AUDIT CarbonLoadDataCommand: [aaron.local][aaron][Thread-1]Dataload failure for default.test_s3_table. Please check the logs 18/09/02 21:4

Re: error occur when I load data to s3

2018-09-03 Thread aaron
Thanks, I will give it a try.

Re: error occur when I load data to s3

2018-09-03 Thread aaron
Thanks, I will give it a try!

Re: error occur when I load data to s3

2018-09-03 Thread aaron
Compile failed. My env is: aaron:carbondata aaron$ java -version java version "1.8.0_144" Java(TM) SE Runtime Environment (build 1.8.0_144-b01) Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode) aaron:carbondata aaron$ mvn -v Apache M

Re: error occur when I load data to s3

2018-09-03 Thread aaron
Hi kunalkapoor, It seems that the error is not fixed yet. Do you have any ideas? Thanks, Aaron aaron:2.2.1 aaron$ spark-shell --executor-memory 4g --driver-memory 2g Ivy Default Cache set to: /Users/aaron/.ivy2/cache The jars for the packages stored in: /Users/aaron/.ivy2/jars :: loading settings

Re: error occur when I load data to s3

2018-09-04 Thread aaron
D0240EBDD2234 18/09/04 14:45:10 DEBUG PoolingClientConnectionManager: Connection [id: 1][route: {s}->https://aa-sdk-test2.s3.us-east-1.amazonaws.com:443] can be kept alive indefinitely 18/09/04 14:45:10 DEBUG PoolingClientConnectionManager: Connection released: [id: 1][route: {s}->https://aa-sdk-t

Re: Issues about dictionary and S3

2018-09-24 Thread aaron
3. carbondata1.5.0-SNAPSHOT & spark2.2.2 4. carbondata1.5.0-SNAPSHOT & spark2.3.1 We would use many preaggregate tables in our business, and filter & join would be very common cases for us. Looking forward to your good news. Thanks Aaron

Re: Issues about dictionary and S3

2018-09-24 Thread aaron
Hi kunalkapoor, More info for you. 1. One comment about how to reproduce this: the query was distributed to Spark workers on different nodes for execution. 2. Detailed stacktrace: scala> carbon.time(carbon.sql( | s"""SELECT sum(est_free_app_download), timeseries(date, 'MONTH'),

Bloomfilter datamap with pre agg datamap will break normal group by query

2018-09-24 Thread aaron
Hi Community, I found that combining a bloomfilter datamap with a pre-aggregate datamap breaks a normal GROUP BY query. When I drop the bloomfilter datamap, the query works. Demo SQL: CREATE TABLE IF NOT EXISTS store(

[Serious Issue] Rows disappeared

2018-09-26 Thread aaron
Hi Community, It seems that rows disappeared: the same query returns different results. carbon.time(carbon.sql( s""" |EXPLAIN SELECT date, market_code, device_code, country_code, category_id, product_id, est_free_app_download, est_paid_app_download, est_revenue |FROM store

Re: Issues about dictionary and S3

2018-09-26 Thread aaron
Thanks, I've checked already and it works well! Impressively quick response!

[Issue] Load auto compaction failed

2018-09-26 Thread aaron
Hi community, On 1.5.0, a load with local dictionary and local sort failed when the data count reached 0.5 billion, although I had already loaded 50 billion before with global dictionary and sort. Do you have any ideas? 18/09/26 08:39:45 AUDIT CarbonTableCompactor:

Re: [Issue] Bloomfilter datamap

2018-09-25 Thread aaron
Great! Thanks for your quick response! I will give it a try. Do you mean that I should merge https://github.com/apache/carbondata/pull/2665? Thanks, Aaron

Re: [Issue] Bloomfilter datamap

2018-09-25 Thread aaron
I use 1.5.0-SNAPSHOT, but I'm not sure about 1.4.1 (I forget whether I have tested it or not).

Re: [Issue] Bloomfilter datamap

2018-09-25 Thread aaron
With that fix, dropping the existing table and data and then re-creating the table and datamap works exactly as you said, with no problem. But yesterday I did not delete the data and table and just created a new datamap, and that leads to some problems.

Re: [Issue] Bloomfilter datamap

2018-09-25 Thread aaron
Yes, you're right. The fix makes master work now.

Re: Issues about dictionary and S3

2018-09-25 Thread aaron
Thanks a lot! Looking forward to your good news.

Re: [Issue] Bloomfilter datamap

2018-09-25 Thread aaron
One more comment: it seems that the bloomfilter datamap disappears from the query plan of the detailed query. So which cases is the bloomfilter meant for?

Re: Issues about dictionary and S3

2018-09-25 Thread aaron
Hi kunalkapoor, Thanks very much for your quick response! 1. For the global dictionary issue, do you have a rough plan for the fix? 2. How is the local dictionary bug on spark 2.3.1 going? Looking forward to the fix! Thanks, Aaron

Issues about dictionary and S3

2018-09-23 Thread aaron
Hi Community, During a POC I found some possible issues with dictionary and S3 compatibility, and I have attached them as a CSV. Could you please have a look? Thanks, Aaron Possible_Issues.csv <http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/file/t357/Possible_Issues.

[Minor Issue] BETWEEN AND does work as expected

2018-09-27 Thread aaron
Hi Community, BETWEEN ... AND works as >= AND <; I guess it should be >= AND <=. My env is spark2.2.2 + carbondata1.4.1. %Carbondata scala> carbon.time(carbon.sql( | s"""SELECT timeseries(date, 'DAY') as day, market_code, device_code, country_code, category_id, |
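
The difference between the two readings of BETWEEN in the post above can be sketched in plain Scala. This is only an illustration of the two predicates, not a reproduction of the reported query: `BetweenSemantics` and its method names are hypothetical, and the dates are invented for the example.

```scala
import java.time.LocalDate

object BetweenSemantics {
  // Standard SQL reading: x BETWEEN lo AND hi means x >= lo AND x <= hi.
  def inclusive(x: LocalDate, lo: LocalDate, hi: LocalDate): Boolean =
    !x.isBefore(lo) && !x.isAfter(hi)

  // The behaviour reported in the post: x >= lo AND x < hi.
  def upperExclusive(x: LocalDate, lo: LocalDate, hi: LocalDate): Boolean =
    !x.isBefore(lo) && x.isBefore(hi)

  def main(args: Array[String]): Unit = {
    // Hypothetical dates, chosen only for illustration.
    val lo = LocalDate.parse("2018-09-01")
    val hi = LocalDate.parse("2018-09-03")
    val days = Seq(lo, lo.plusDays(1), hi)
    println(days.count(inclusive(_, lo, hi)))      // prints 3
    println(days.count(upperExclusive(_, lo, hi))) // prints 2
  }
}
```

Comparing a BETWEEN query against the same query written with explicit `>=` and `<=` is one way to confirm which of the two readings an engine applies.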

Re: [Issue] Load auto compaction failed

2018-09-27 Thread aaron
Good explanation, it works now! Thanks.

Re: [Issue] Load auto compaction failed

2018-09-27 Thread aaron
Good suggestion, it works now!

Re: CarbonData (1.5.2) TPCH Reports

2019-04-10 Thread aaron
Hi Gururaj, I did not see from your report where carbon is better than parquet. So I'm wondering why we should use carbon rather than parquet, since we all know parquet is more popular.