[GitHub] carbondata issue #1948: [CARBONDATA-2143] Fixed query memory leak issue for ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1948 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3425/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1952 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3580/ ---
[GitHub] carbondata issue #1825: [CARBONDATA-2032][DataLoad] directly write carbon da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1825 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3582/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1952 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2342/ ---
[GitHub] carbondata issue #1825: [CARBONDATA-2032][DataLoad] directly write carbon da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1825 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2344/ ---
[GitHub] carbondata pull request #1953: [CARBONDATA-2091][DataLoad] Support specifyin...
GitHub user xuchuanyin opened a pull request: https://github.com/apache/carbondata/pull/1953 [CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data loading Enhance data loading performance by specifying sort column bounds 1. Add row range number during convert-process-step 2. Dispatch rows to each sorter by range number 3. Sort/Write process steps can be done concurrently in each range Tests added and docs updated After implementing this feature, data load performance improved by about 25% (80MB/s/Node -> 102MB/s/Node) in my scenario with only 1 bound provided. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? `Only internally used interfaces are changed` - [x] Any backward compatibility impacted? `No` - [x] Document update required? `Yes, added the usage of this feature to the documents` - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? `Yes` - How it is tested? Please attach test report. `Tested on a 3-node cluster and a local machine` - Is it a performance related change? Please attach the performance test report. `Yes. After implementing this feature, data load performance improved by about 25% (80MB/s/Node -> 102MB/s/Node) in my scenario with only 1 bound provided.` - Any additional information to help reviewers in testing this change. `I refactored the bucket-related feature and treated range and bucket with similar logic` - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
`Not related` You can merge this pull request into a Git repository by running: $ git pull https://github.com/xuchuanyin/carbondata 0208_support_specifying_sort_column_bounds Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1953.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1953 commit 11463dd22db17f2e1858e0a1f3ebfeb07e3ec0e9 Author: xuchuanyin Date: 2018-02-08T08:30:09Z Support specifying sort column bounds in data loading Enhance data loading performance by specifying sort column bounds 1. Add row range number during convert-process-step 2. Dispatch rows to each sorter by range number 3. Sort/Write process step can be done concurrently in each range Tests added and docs updated ---
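The dispatch step described in the PR (assign each row a range number from the user-supplied sort column bounds, then hand each range to its own sorter) can be sketched as follows. All names here are hypothetical illustrations, not CarbonData's actual implementation:

```python
import bisect

# Hypothetical user-supplied sort column bounds: one bound splits the
# sort-key space into two ranges (keys < "m" -> range 0, else range 1).
bounds = ["m"]

def range_of(sort_key):
    """Assign a row to a range by binary-searching the bound list."""
    return bisect.bisect_right(bounds, sort_key)

def dispatch(rows, n_ranges):
    """Group rows into per-range buckets; each bucket can then be
    sorted and written by an independent sorter, concurrently."""
    buckets = [[] for _ in range(n_ranges)]
    for row in rows:
        buckets[range_of(row[0])].append(row)  # row[0] is the sort column
    return buckets

buckets = dispatch([("apple", 1), ("zebra", 2), ("kiwi", 3)], len(bounds) + 1)
```

With well-chosen bounds the per-range sorters receive similar row counts, which is where the concurrency gain the author reports comes from.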
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1951 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3581/ ---
[GitHub] carbondata issue #1953: [CARBONDATA-2091][DataLoad] Support specifying sort ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1953 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2346/ ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1951 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2343/ ---
[GitHub] carbondata issue #1808: [CARBONDATA-2023][DataLoad] Add size base block allo...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1808 this PR depends on #1952 ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1792 this PR depends on #1952 ---
[GitHub] carbondata issue #1825: [CARBONDATA-2032][DataLoad] directly write carbon da...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1825 this PR depends on #1952 ---
[GitHub] carbondata issue #1953: [CARBONDATA-2091][DataLoad] Support specifying sort ...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1953 this PR depends on #1952 ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1792 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3426/ ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1951 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2345/ ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1951 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3583/ ---
[GitHub] carbondata issue #1949: [CARBONDATA2144] Optimize preaggregate table documen...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3584/ ---
[GitHub] carbondata issue #1953: [CARBONDATA-2091][DataLoad] Support specifying sort ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1953 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3586/ ---
[GitHub] carbondata issue #1949: [CARBONDATA2144] Optimize preaggregate table documen...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1949 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2347/ ---
[GitHub] carbondata issue #1935: [CARBONDATA-2134] Prevent implicit column filter lis...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1935 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3585/ ---
[GitHub] carbondata issue #1808: [CARBONDATA-2023][DataLoad] Add size base block allo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1808 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3427/ ---
[GitHub] carbondata issue #1935: [CARBONDATA-2134] Prevent implicit column filter lis...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1935 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2348/ ---
[GitHub] carbondata pull request #1857: [WIP][CARBONDATA-2073][CARBONDATA-1516][Tests...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1857#discussion_r166269467 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateLoad.scala --- @@ -412,8 +430,467 @@ test("check load and select for avg double datatype") { sql(s"LOAD DATA LOCAL INPATH '$testData' into table maintable") sql(s"LOAD DATA LOCAL INPATH '$testData' into table maintable") val rows = sql("select age,avg(age) from maintable group by age").collect() -sql("create datamap maintbl_douoble on table maintable using 'preaggregate' as select avg(age) from maintable group by age") +sql("create datamap maintbl_double on table maintable using 'preaggregate' as select avg(age) from maintable group by age") checkAnswer(sql("select age,avg(age) from maintable group by age"), rows) +sql("drop table if exists maintable ") + } + + def testFunction(): Unit = { +// check answer +checkAnswer(sql(s"SELECT * FROM main_table_preagg_sum"), + Seq(Row(1, 31), Row(2, 27), Row(3, 70), Row(4, 55))) +checkAnswer(sql(s"SELECT * FROM main_table_preagg_avg"), + Seq(Row(1, 31, 1), Row(2, 27, 1), Row(3, 70, 2), Row(4, 55, 2))) +checkAnswer(sql(s"SELECT * FROM main_table_preagg_count"), + Seq(Row(1, 1), Row(2, 1), Row(3, 2), Row(4, 2))) +checkAnswer(sql(s"SELECT * FROM main_table_preagg_min"), + Seq(Row(1, 31), Row(2, 27), Row(3, 35), Row(4, 26))) +checkAnswer(sql(s"SELECT * FROM main_table_preagg_max"), + Seq(Row(1, 31), Row(2, 27), Row(3, 35), Row(4, 29))) + +// check select and match or not match pre-aggregate table +checkPreAggTable(sql("SELECT id, SUM(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_sum") +checkPreAggTable(sql("SELECT id, SUM(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_avg", "main_table") + +checkPreAggTable(sql("SELECT id, AVG(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_avg") +checkPreAggTable(sql("SELECT id, AVG(age) 
from main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT id, COUNT(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_count") +checkPreAggTable(sql("SELECT id, COUNT(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT id, MIN(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_min") +checkPreAggTable(sql("SELECT id, MIN(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT id, MAX(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_max") +checkPreAggTable(sql("SELECT id, MAX(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +// sub query should match pre-aggregate table +checkPreAggTable(sql("SELECT SUM(age) FROM main_table"), + true, "main_table_preagg_sum") +checkPreAggTable(sql("SELECT SUM(age) FROM main_table"), + false, "main_table_preagg_avg", "main_table") + +checkPreAggTable(sql("SELECT AVG(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_avg") +checkPreAggTable(sql("SELECT AVG(age) from main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT COUNT(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_count") +checkPreAggTable(sql("SELECT COUNT(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT MIN(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_min") +checkPreAggTable(sql("SELECT MIN(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + +checkPreAggTable(sql("SELECT MAX(age) FROM main_table GROUP BY id"), + true, "main_table_preagg_max") +checkPreAggTable(sql("SELECT MAX(age) FROM main_table GROUP BY id"), + false, "main_table_preagg_sum", "main_table") + } + + test("test load into main table with pre-aggregate table: double") { 
+sql( + """ +| CREATE TABLE main_table( +| id INT, +| name STRING, +| city STRING, +| age DOUBLE) +| STORED BY 'org.apache.carbondata.format' + """.stripMargin) + +createAllAggregateTables("main_table") +sql(s"LOAD DATA LOCAL INPATH '$testData' INTO TABLE ma
[GitHub] carbondata issue #1937: [CARBONDATA-2137] Delete query performance improved
Github user rahulforallp commented on the issue: https://github.com/apache/carbondata/pull/1937 @sraghunandan Performance report is added in comment section. ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1951 @ravipesala Build success ---
[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1867 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3588/ ---
[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1867 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2350/ ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2351/ ---
[GitHub] carbondata issue #1941: [CARBONDATA1506] fix SDV error in PushUP_FILTER_uniq...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1941 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3428/ ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3589/ ---
[GitHub] carbondata issue #1825: [CARBONDATA-2032][DataLoad] directly write carbon da...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1825 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3429/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1952 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3430/ ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1951 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3431/ ---
[jira] [Created] (CARBONDATA-2147) Exception displays while loading data with streaming
Vandana Yadav created CARBONDATA-2147: - Summary: Exception displays while loading data with streaming Key: CARBONDATA-2147 URL: https://issues.apache.org/jira/browse/CARBONDATA-2147 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 1.3.0 Environment: spark 2.1, spark 2.2.1 Reporter: Vandana Yadav Exception displays while loading data with streaming Steps to reproduce: 1) start spark-shell: ./spark-shell --jars /opt/spark/spark-2.2.1/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar 2) Execute following script: import org.apache.spark.sql.SparkSession import org.apache.spark.sql.CarbonSession._ import org.apache.carbondata.core.util.CarbonProperties import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} val carbon = SparkSession.builder().config(sc.getConf) .getOrCreateCarbonSession("hdfs://localhost:54310/newCarbonStore","/tmp") import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.util.CarbonProperties CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FORCE") carbon.sql("drop table if exists uniqdata_stream") carbon.sql("create table uniqdata_stream(CUST_ID int,CUST_NAME String,DOB timestamp,DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10),DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('TABLE_BLOCKSIZE'= '256 MB', 'streaming'='true')"); import carbon.sqlContext.implicits._ import org.apache.spark.sql.types._ val uniqdataSch = StructType( Array(StructField("CUST_ID", IntegerType),StructField("CUST_NAME", StringType),StructField("DOB", TimestampType), StructField("DOJ", TimestampType), StructField("BIGINT_COLUMN1", LongType), StructField("BIGINT_COLUMN2", LongType), StructField("DECIMAL_COLUMN1", org.apache.spark.sql.types.DecimalType(30, 10)), StructField("DECIMAL_COLUMN2", 
org.apache.spark.sql.types.DecimalType(36,10)), StructField("Double_COLUMN1", DoubleType), StructField("Double_COLUMN2", DoubleType), StructField("INTEGER_COLUMN1", IntegerType))) val streamDf = carbon.readStream .schema(uniqdataSch) .option("sep", ",") .csv("file:///home/knoldus/Documents/uniqdata") val qry = streamDf.writeStream.format("carbondata").trigger(ProcessingTime("5 seconds")) .option("checkpointLocation","/stream/uniq") .option("dbName", "default") .option("tableName", "uniqdata_stream") .start() 3) Error logs: warning: there was one deprecation warning; re-run with -deprecation for details uniqdataSch: org.apache.spark.sql.types.StructType = StructType(StructField(CUST_ID,IntegerType,true), StructField(CUST_NAME,StringType,true), StructField(DOB,TimestampType,true), StructField(DOJ,TimestampType,true), StructField(BIGINT_COLUMN1,LongType,true), StructField(BIGINT_COLUMN2,LongType,true), StructField(DECIMAL_COLUMN1,DecimalType(30,10),true), StructField(DECIMAL_COLUMN2,DecimalType(36,10),true), StructField(Double_COLUMN1,DoubleType,true), StructField(Double_COLUMN2,DoubleType,true), StructField(INTEGER_COLUMN1,IntegerType,true)) streamDf: org.apache.spark.sql.DataFrame = [CUST_ID: int, CUST_NAME: string ... 
9 more fields] qry: org.apache.spark.sql.streaming.StreamingQuery = org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@d0e155c scala> 18/02/08 16:38:53 ERROR StreamSegment: Executor task launch worker for task 5 Failed to append batch data to stream segment: hdfs://localhost:54310/newCarbonStore/default/uniqdata_stream1/Fact/Part0/Segment_0 java.lang.NullPointerException at org.apache.spark.sql.catalyst.InternalRow.getString(InternalRow.scala:32) at org.apache.carbondata.streaming.parser.CSVStreamParserImp.parserRow(CSVStreamParserImp.java:40) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$InputIterator.next(CarbonAppendableStreamSink.scala:337) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$InputIterator.next(CarbonAppendableStreamSink.scala:331) at org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:315) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:305) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:305) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371) at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendable
[GitHub] carbondata pull request #1954: [Documentation] Formatting issue fixed
GitHub user jatin9896 opened a pull request: https://github.com/apache/carbondata/pull/1954 [Documentation] Formatting issue fixed Updated document syntax of which pdf generation was failing Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jatin9896/incubator-carbondata DocumentUpdate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1954.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1954 commit e4cb62cedc6a48a6a2b10f11635ea561a4453ca1 Author: Jatin Date: 2018-02-08T10:55:14Z updated data-management for pdf generation ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1951 retest sdv please ---
[GitHub] carbondata issue #1904: [CARBONDATA-2059] - Changes to support compaction fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1904 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2352/ ---
[GitHub] carbondata issue #1904: [CARBONDATA-2059] - Changes to support compaction fo...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1904 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3590/ ---
[jira] [Commented] (CARBONDATA-2147) Exception displays while loading data with streaming
[ https://issues.apache.org/jira/browse/CARBONDATA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356870#comment-16356870 ] Zhichao Zhang commented on CARBONDATA-2147: [~Vandana7] I can resolve this issue, the default parser 'CSVStreamParserImp' will cause this problem. ---
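The stack trace reported for CARBONDATA-2147 points at `CSVStreamParserImp.parserRow` calling `InternalRow.getString` on a field that is null. A minimal Python sketch of that failure pattern and a null-safe variant (hypothetical names, not CarbonData or Spark code):

```python
def parse_row_unsafe(row):
    # Mirrors the failing pattern: every field is converted
    # unconditionally, so a null field raises instead of
    # being propagated as null.
    return [v.strip() for v in row]

def parse_row_safe(row):
    # Null-safe variant: pass null fields through untouched.
    return [None if v is None else v.strip() for v in row]
```

This is only an illustration of the bug class; the commenter's point is that the default stream parser does the unsafe conversion.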
[GitHub] carbondata issue #1825: [CARBONDATA-2032][DataLoad] directly write carbon da...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1825 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3432/ ---
[GitHub] carbondata issue #1954: [Documentation] Formatting issue fixed
Github user sgururajshetty commented on the issue: https://github.com/apache/carbondata/pull/1954 LGTM ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1952 retest this please ---
[GitHub] carbondata issue #1954: [Documentation] Formatting issue fixed
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1954 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3591/ ---
[GitHub] carbondata issue #1954: [Documentation] Formatting issue fixed
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1954 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2353/ ---
[GitHub] carbondata issue #1941: [CARBONDATA1506] fix SDV error in PushUP_FILTER_uniq...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1941 retest sdv please ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1951 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3433/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1952 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3592/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1952 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2354/ ---
[GitHub] carbondata issue #1935: [CARBONDATA-2134] Prevent implicit column filter lis...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1935 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3434/ ---
[GitHub] carbondata issue #1953: [CARBONDATA-2091][DataLoad] Support specifying sort ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1953 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3435/ ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3593/ ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1952 LGTM ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2355/ ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1792 retest this please ---
[GitHub] carbondata pull request #1808: [CARBONDATA-2023][DataLoad] Add size base blo...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1808#discussion_r166936712 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java --- @@ -114,4 +114,14 @@ */ public static final int MAX_EXTERNAL_DICTIONARY_SIZE = 1000; + /** + * enable block size based block allocation while loading data. By default, carbondata assigns + * blocks to node based on block number. If this option is set to `true`, carbondata will + * consider block size first and make sure that all the nodes will process almost equal size of + * data. This option is especially useful when you encounter skewed data. + */ + @CarbonProperty + public static final String ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION + = "carbon.load.skewed.data.optimization"; --- End diff -- change to `carbon.load.skewedDataOptimization.enabled` ---
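The option discussed in this review enables size-based (rather than count-based) block-to-node assignment for skewed data. A small sketch of the two strategies (hypothetical names; not CarbonData's actual allocator), with the size-based one using a greedy least-loaded heap:

```python
import heapq

def assign_by_count(blocks, nodes):
    """Count-based sketch: round-robin by block number; block counts per
    node are equal, but total bytes can be very skewed."""
    out = {n: [] for n in nodes}
    for i, (name, _size) in enumerate(blocks):
        out[nodes[i % len(nodes)]].append(name)
    return out

def assign_by_size(blocks, nodes):
    """Size-based sketch: give the next-largest block to the currently
    least-loaded node, so total bytes per node stay close."""
    heap = [(0, n) for n in nodes]  # (assigned bytes, node)
    heapq.heapify(heap)
    out = {n: [] for n in nodes}
    for name, size in sorted(blocks, key=lambda b: -b[1]):
        load, node = heapq.heappop(heap)
        out[node].append(name)
        heapq.heappush(heap, (load + size, node))
    return out
```

For blocks of sizes 100, 10, 10, 10 over two nodes, round-robin gives one node 110 bytes and the other 20, while the greedy size-based assignment puts the large block alone on one node; that is the skew the option is meant to address.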
[GitHub] carbondata issue #1941: [CARBONDATA1506] fix SDV error in PushUP_FILTER_uniq...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1941 retest sdv please ---
[GitHub] carbondata issue #1949: [CARBONDATA2144] Optimize preaggregate table documen...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1949 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3436/ ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1792 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3594/ ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1792 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2356/ ---
[GitHub] carbondata issue #1943: [CARBONDATA-2142] Fixed Pre-Aggregate datamap creati...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1943 LGTM ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3595/ ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1857 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2357/ ---
[GitHub] carbondata pull request #1943: [CARBONDATA-2142] Fixed Pre-Aggregate datamap...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1943 ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1857 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3437/ ---
[GitHub] carbondata issue #1867: [CARBONDATA-2055][Streaming][WIP]Support integrating...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1867 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3438/ ---
[GitHub] carbondata pull request #1955: [HOTFIX] Fix documentation errors.
GitHub user sraghunandan opened a pull request: https://github.com/apache/carbondata/pull/1955 [HOTFIX] Fix documentation errors. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? Yes - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. NA - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/sraghunandan/carbondata-1 make_doc_example_simple Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1955.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1955 commit 47e2396707c861774feb0a5f993038ad79ddc933 Author: Raghunandan S Date: 2018-02-08T16:00:03Z [HOTFIX] Fix documentation errors. ---
[GitHub] carbondata issue #1857: [CARBONDATA-2073][CARBONDATA-1516][Tests] Add test c...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1857 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3439/ ---
[GitHub] carbondata pull request #1955: [HOTFIX] Fix documentation errors.
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1955#discussion_r166984885 --- Diff: docs/data-management-on-carbondata.md --- @@ -955,7 +947,7 @@ roll-up for the queries on these hierarchies. USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', - 'year_granualrity'='1', + 'year_granularity'='1', --- End diff -- please remove "," ---
[GitHub] carbondata issue #1955: [HOTFIX] Fix documentation errors.
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1955 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3597/ ---
[GitHub] carbondata issue #1904: [CARBONDATA-2059] - Changes to support compaction fo...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1904 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3440/ ---
[GitHub] carbondata issue #1955: [HOTFIX] Fix documentation errors.
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1955 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2359/ ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1792 LGTM ---
[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1792 merged into carbonstore branch ---
[GitHub] carbondata issue #1952: [HotFix][CheckStyle] Fix import related checkstyle
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1952 merged into carbonstore branch ---
[GitHub] carbondata issue #1928: [MINOR]Remove dependency of Java 1.8
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1928 LGTM ---
[GitHub] carbondata pull request #1928: [MINOR]Remove dependency of Java 1.8
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1928 ---
[GitHub] carbondata issue #1808: [CARBONDATA-2023][DataLoad] Add size base block allo...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1808 retest this please ---
[GitHub] carbondata issue #1947: [CARBONDATA-2119]deserialization issue for carbonloa...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1947 LGTM ---
[GitHub] carbondata pull request #1947: [CARBONDATA-2119]deserialization issue for ca...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1947 ---
[GitHub] carbondata issue #1948: [CARBONDATA-2143] Fixed query memory leak issue for ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1948 LGTM ---
[jira] [Created] (CARBONDATA-2148) Use Row parser to replace current default parser:CSVStreamParserImp
Zhichao Zhang created CARBONDATA-2148: -- Summary: Use Row parser to replace current default parser:CSVStreamParserImp Key: CARBONDATA-2148 URL: https://issues.apache.org/jira/browse/CARBONDATA-2148 Project: CarbonData Issue Type: Improvement Components: data-load, spark-integration Affects Versions: 1.3.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.3.0 Currently the default value of 'carbon.stream.parser' is CSVStreamParserImp, which transforms InternalRow(0) to Array[Object]; InternalRow(0) holds the value of one line received from a socket. When data is received from Kafka, the schema of the InternalRow changes: either the fields of the Kafka data Row must be assembled into a String and stored as InternalRow(0), or a new parser must be defined to convert the Kafka data Row to Array[Object]. The same work is needed for every table. *Solution:* Use a new parser, RowStreamParserImpl, as the default parser instead of CSVStreamParserImp; this parser automatically converts an InternalRow to Array[Object] according to the schema. In general, source data is transformed into a structured Row object, so with this approach there is no need to define a parser for every table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
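The difference between the two parsing strategies described in CARBONDATA-2148 can be sketched with plain Java (illustrative interfaces only; these are not CarbonData's actual parser classes or Spark's InternalRow API):

```java
import java.util.*;

// Illustrative contrast between the two strategies, with rows modeled as
// plain Object[] instead of Spark's InternalRow.
interface StreamParserSketch {
  Object[] parse(Object[] row, String[] schema);
}

// CSV-style: the whole input line arrives as a single String in field 0
// and must be tokenized into the schema's columns.
class CsvStreamParserSketch implements StreamParserSketch {
  public Object[] parse(Object[] row, String[] schema) {
    return ((String) row[0]).split(",");
  }
}

// Row-style: each field already occupies its own position (as with a
// structured Kafka source), so the parser just maps fields by schema
// position, with no per-table string assembly needed.
class RowStreamParserSketch implements StreamParserSketch {
  public Object[] parse(Object[] row, String[] schema) {
    Object[] out = new Object[schema.length];
    for (int i = 0; i < schema.length; i++) out[i] = row[i];
    return out;
  }
}
```

The row-style parser works unchanged for any table whose row layout matches its schema, which is the motivation given in the issue.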
[jira] [Created] (CARBONDATA-2149) Displayed complex type data is wrong when using DataFrame to write complex type data.
Zhichao Zhang created CARBONDATA-2149: -- Summary: Displayed complex type data is wrong when using DataFrame to write complex type data. Key: CARBONDATA-2149 URL: https://issues.apache.org/jira/browse/CARBONDATA-2149 Project: CarbonData Issue Type: Bug Components: data-load, spark-integration Affects Versions: 1.3.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.3.0 The default values of 'complex_delimiter_level_1' and 'complex_delimiter_level_2' are wrong: they must be '$' and ':', not '\\$' and '\\:'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
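The delimiter bug in CARBONDATA-2149 is easy to see in a small sketch: the configured values should be the plain characters '$' and ':', and any regex escaping belongs in the parsing code, not in the stored defaults. This is an illustrative sketch (hypothetical helper, not CarbonData's actual complex-type parser):

```java
import java.util.*;

// Illustrative sketch of the two complex-type delimiter levels: level 1
// ('$') separates collection elements, level 2 (':') separates nested
// fields. Java's String.split takes a regex, so '$' must be escaped here,
// in the code; the configured default value itself stays the plain '$'.
class ComplexDelimiterSketch {
  static String[][] parseArrayOfStructs(String value) {
    String[] elements = value.split("\\$");  // escape '$' only for the regex
    String[][] result = new String[elements.length][];
    for (int i = 0; i < elements.length; i++) {
      result[i] = elements[i].split(":");
    }
    return result;
  }
}
```

Storing the escaped form '\\$' as the default means the literal two-character sequence is looked for in the data, which is why the displayed complex-type values come out wrong.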
[GitHub] carbondata pull request #1948: [CARBONDATA-2143] Fixed query memory leak iss...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1948 ---
[GitHub] carbondata issue #1951: [CARBONDATA-1763] Dropped table if exception thrown ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1951 @kunal642 Please rebase ---
[GitHub] carbondata issue #1928: [MINOR]Remove dependency of Java 1.8
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1928 @jackylk should this pr be merged into branch-1.3? ---
[GitHub] carbondata issue #1954: [Documentation] Formatting issue fixed
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1954 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3441/ ---