date:20191103

[GitHub] [carbondata] jackylk opened a new pull request #3431: [CARBONDATA-3566] Support add segment for partition table

2019-11-03 Thread GitBox

jackylk opened a new pull request #3431: [CARBONDATA-3566] Support add segment 
for partition table
URL: https://github.com/apache/carbondata/pull/3431
 
 
   CarbonData already supports ADD SEGMENT for non-partition tables; it should 
also support Hive partition tables. 
   
   This PR supports:
   ```
   create table parquet_table(value int, name string, age int) using parquet partitioned by (name, age);
   
   create table carbon_table(value int) partitioned by (name string, age int) stored as carbondata;
   
   insert into parquet_table values (30, 'amy', 12), (40, 'bob', 13);
   
   alter table carbon_table add segment options ('path'='$parquetRootPath', 'format'='parquet', 'partition'='name:string,age:int');
   
   select * from carbon_table;
   ```
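
   The 'partition' option above reads as an ordered list of column:type pairs. 
A minimal sketch of parsing such a value follows. It assumes a hypothetical 
PartitionSpec helper that is not part of CarbonData; the parser in this PR may 
work differently.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of CarbonData.
public class PartitionSpec {

  // Parses a value such as "name:string,age:int" into an ordered column -> type map.
  public static Map<String, String> parse(String spec) {
    Map<String, String> columns = new LinkedHashMap<>();
    for (String entry : spec.split(",")) {
      String[] parts = entry.trim().split(":");
      if (parts.length != 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Invalid partition entry: " + entry);
      }
      // Type names are lower-cased so that "INT" and "int" are treated alike (an assumption).
      columns.put(parts[0].trim(), parts[1].trim().toLowerCase(Locale.ROOT));
    }
    return columns;
  }

  public static void main(String[] args) {
    // prints {name=string, age=int}
    System.out.println(parse("name:string,age:int"));
  }
}
```

   A LinkedHashMap is used so the partition columns keep the order in which 
they were declared, which matters for a directory layout like name=amy/age=12.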
   
- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?
   
- [ ] Testing done
   Please provide details on 
   - Whether new unit test cases have been added or why no new tests 
are required?
   - How it is tested? Please attach test report.
   - Is it a performance related change? Please attach the performance 
test report.
   - Any additional information to help reviewers in testing this 
change.
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Updated] (CARBONDATA-3566) Support add segment for partition table

2019-11-03 Thread Jacky Li (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-3566:
-
Description: CarbonData supports ADD SEGMENT for non-partition table 
already, it should also support for Hive partition table.   (was: CarbonData 
supports ADD SEGMENT for non-partition table, it should support for Hive 
partition table. )

> Support add segment for partition table
> ---
>
> Key: CARBONDATA-3566
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3566
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
> Fix For: 2.0.0
>
>
> CarbonData supports ADD SEGMENT for non-partition table already, it should 
> also support for Hive partition table. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (CARBONDATA-3566) Support add segment for partition table

2019-11-03 Thread Jacky Li (Jira)

Jacky Li created CARBONDATA-3566:


 Summary: Support add segment for partition table
 Key: CARBONDATA-3566
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3566
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 2.0.0


CarbonData supports ADD SEGMENT for non-partition table, it should support for 
Hive partition table. 




[GitHub] [carbondata] CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for Insert into Complex data type of all Primitive types with 2 levels

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for 
Insert into Complex data type of all Primitive types with 2 levels
URL: https://github.com/apache/carbondata/pull/3384#issuecomment-549246991
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/736/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for Insert into Complex data type of all Primitive types with 2 levels

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for 
Insert into Complex data type of all Primitive types with 2 levels
URL: https://github.com/apache/carbondata/pull/3384#issuecomment-549245890
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/735/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add 
Segment
URL: https://github.com/apache/carbondata/pull/3426#issuecomment-549245288
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/730/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549243514
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/735/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549242378
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/734/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3407: [CARBONDATA-3542] Support Map data type reading through Hive

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3407: [CARBONDATA-3542] Support Map data type 
reading through Hive
URL: https://github.com/apache/carbondata/pull/3407#issuecomment-549240457
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/729/
   



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341915293
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
 ##
 @@ -92,6 +92,22 @@ case class CarbonAddLoadCommand(
 val segmentPath = options.getOrElse(
   "path", throw new UnsupportedOperationException("PATH is manadatory"))
 
+val format = options.getOrElse("format", "carbondata")
+val isCarbonFormat = format.equals("carbondata") || format.equals("carbon")
+
+val allSegments = 
SegmentStatusManager.readLoadMetadata(carbonTable.getMetadataPath)
+
+for (currSegment <- allSegments) {
 
 Review comment:
   Done.



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341914568
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
 ##
 @@ -92,6 +92,22 @@ case class CarbonAddLoadCommand(
 val segmentPath = options.getOrElse(
   "path", throw new UnsupportedOperationException("PATH is manadatory"))
 
+val format = options.getOrElse("format", "carbondata")
+val isCarbonFormat = format.equals("carbondata") || format.equals("carbon")
 
 Review comment:
   Done.



[GitHub] [carbondata] CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add 
Segment
URL: https://github.com/apache/carbondata/pull/3426#issuecomment-549233999
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/733/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to SparkSessionExtension

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to 
SparkSessionExtension
URL: https://github.com/apache/carbondata/pull/3393#issuecomment-549233783
 
 
   Build Failed with Spark 2.3.2, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/734/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add 
Segment
URL: https://github.com/apache/carbondata/pull/3426#issuecomment-549233681
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/732/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to SparkSessionExtension

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to 
SparkSessionExtension
URL: https://github.com/apache/carbondata/pull/3393#issuecomment-549233304
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/733/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for Insert into Complex data type of all Primitive types with 2 levels

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3384: [CARBONDATA-3556] Added testcases for 
Insert into Complex data type of all Primitive types with 2 levels
URL: https://github.com/apache/carbondata/pull/3384#issuecomment-549231942
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/728/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549231539
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/727/
   



[GitHub] [carbondata] ajantha-bhat commented on issue #3407: [CARBONDATA-3542] Support Map data type reading through Hive

2019-11-03 Thread GitBox

ajantha-bhat commented on issue #3407: [CARBONDATA-3542] Support Map data type 
reading through Hive
URL: https://github.com/apache/carbondata/pull/3407#issuecomment-549230382
 
 
   retest this please



[GitHub] [carbondata] ajantha-bhat commented on issue #3407: [CARBONDATA-3542] Support Map data type reading through Hive

2019-11-03 Thread GitBox

ajantha-bhat commented on issue #3407: [CARBONDATA-3542] Support Map data type 
reading through Hive
URL: https://github.com/apache/carbondata/pull/3407#issuecomment-549230367
 
 
   LGTM



[GitHub] [carbondata] ajantha-bhat commented on issue #3384: [CARBONDATA-3556] Added testcases for Insert into Complex data type of all Primitive types with 2 levels

2019-11-03 Thread GitBox

ajantha-bhat commented on issue #3384: [CARBONDATA-3556] Added testcases for 
Insert into Complex data type of all Primitive types with 2 levels
URL: https://github.com/apache/carbondata/pull/3384#issuecomment-549229840
 
 
   retest this please



[jira] [Updated] (CARBONDATA-3565) Binary to string issue when loading dataframe data in NewRddIterator

2019-11-03 Thread ChenKai (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenKai updated CARBONDATA-3565:

Description: 
* issue
When a Spark DataFrame (SQL) loads complex binary data into a Hive table, the 
data is broken when it is read back. In RddIterator, the data is converted to a 
string and then converted back.

* test case
Binary data can be produced with *DataOutputStream#writeDouble* and similar 
methods.

* discussion
I think the *CarbonScalaUtil#getString* operation can be removed now. I dug 
into the code from 2016: it was used by the kettle *CsvInput* (commit: 
0018756d). That code has since been removed, so I think this conversion is 
redundant. (UPDATE: The follow-up code GenericParser uses this string-convert 
logic, so it should be considered here.)

  was:
* issue
When a Spark DataFrame (SQL) loads complex binary data into a Hive table, the 
data is broken when it is read back. In RddIterator, the data is converted to a 
string and then converted back.

* test case
Binary data can be produced with *DataOutputStream#writeDouble* and similar 
methods.

* discussion
I think the *CarbonScalaUtil#getString* operation can be removed now. I dug 
into the code from 2016: it was used by the kettle *CsvInput* (commit: 
0018756d). That code has since been removed, so I think this conversion is 
redundant.


> Binary to string issue when loading dataframe data in NewRddIterator
> 
>
> Key: CARBONDATA-3565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3565
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.6.0
>Reporter: ChenKai
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> * issue
> When a Spark DataFrame (SQL) loads complex binary data into a Hive table, the 
> data is broken when it is read back. In RddIterator, the data is converted 
> to a string and then converted back.
> * test case
> Binary data can be produced with *DataOutputStream#writeDouble* and similar 
> methods.
> * discussion
> I think the *CarbonScalaUtil#getString* operation can be removed now. I dug 
> into the code from 2016: it was used by the kettle *CsvInput* (commit: 
> 0018756d). That code has since been removed, so I think this conversion is 
> redundant. (UPDATE: The follow-up code GenericParser uses this string-convert 
> logic, so it should be considered here.)
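
The breakage described in this issue can be reproduced outside CarbonData. The 
sketch below is an illustration only, not the CarbonData code path: it assumes 
the bytes are decoded and re-encoded as UTF-8, and the class name 
BinaryRoundTrip is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only; this is not the CarbonData loading code.
public class BinaryRoundTrip {

  // Serializes a double with DataOutputStream#writeDouble (8 bytes, big-endian).
  public static byte[] doubleBytes(double value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeDouble(value);
    } catch (IOException e) {
      // Not expected for an in-memory stream.
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  // Converts the bytes to a String and back, using UTF-8 in both directions (an assumption).
  public static byte[] viaString(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] original = doubleBytes(-1.5);
    byte[] roundTripped = viaString(original);
    // prints false: bytes that are not valid UTF-8 are replaced during decoding
    System.out.println(Arrays.equals(original, roundTripped));
  }
}
```

The double -1.5 serializes to bytes starting with 0xBF 0xF8, neither of which 
is valid UTF-8 on its own, so the decoder substitutes replacement characters 
and the original bytes cannot be recovered.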




[GitHub] [carbondata] IceMimosa commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

IceMimosa commented on issue #3430: [CARBONDATA-3565] Fix complex binary data 
broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549228640
 
 
   Fix style



[GitHub] [carbondata] CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3426: [CARBONDATA-3560] Fixed issues for Add 
Segment
URL: https://github.com/apache/carbondata/pull/3426#issuecomment-549224753
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/725/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549224063
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/731/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549223949
 
 
   Build Failed with Spark 2.3.2, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/732/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to SparkSessionExtension

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to 
SparkSessionExtension
URL: https://github.com/apache/carbondata/pull/3393#issuecomment-549223709
 
 
   Build Failed with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/726/
   



[GitHub] [carbondata] CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

CarbonDataQA commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549223532
 
 
   Build Failed with Spark 2.1.0, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/724/
   



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341902454
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
 ##
 @@ -184,6 +184,22 @@ public static String writeSegmentFile(CarbonTable 
carbonTable, String segmentId,
 return writeSegmentFile(carbonTable, segmentId, UUID, null, segPath);
   }
 
+  /**
+   * Returns the list of index files
+   *
+   * @param segmentPath
+   * @return
+   */
+  public static CarbonFile[] getListOfCarbonIndexFiles(String segmentPath) {
+CarbonFile segmentFolder = FileFactory.getCarbonFile(segmentPath);
+CarbonFile[] indexFiles = segmentFolder.listFiles(new CarbonFileFilter() {
+  @Override public boolean accept(CarbonFile file) {
+return (file.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT) || 
file.getName()
 
 Review comment:
   Done.
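
   For readers without the CarbonData classes at hand, the filtering idea in 
this hunk can be sketched with java.io.File. SuffixFileLister is a 
hypothetical stand-in; the PR itself uses CarbonFile and CarbonFileFilter, and 
the suffix strings in the usage note are assumed values, not read from 
CarbonTablePath.

```java
import java.io.File;

// Hypothetical stand-in built on java.io.File; the PR uses CarbonFile and CarbonFileFilter.
public class SuffixFileLister {

  // Returns true when the file name ends with any of the given suffixes.
  public static boolean matches(String fileName, String... suffixes) {
    for (String suffix : suffixes) {
      if (fileName.endsWith(suffix)) {
        return true;
      }
    }
    return false;
  }

  // Lists files in a folder whose names end with one of the suffixes.
  // Returns an empty array when the folder does not exist or cannot be read,
  // so callers can check the length without a null check.
  public static File[] list(String folderPath, String... suffixes) {
    File[] files = new File(folderPath).listFiles((dir, name) -> matches(name, suffixes));
    return files == null ? new File[0] : files;
  }
}
```

   A call such as SuffixFileLister.list(segmentPath, ".carbonindex") would 
then return an empty array for a folder that holds no index files.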



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341902473
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAddLoadCommand.scala
 ##
 @@ -92,6 +92,22 @@ case class CarbonAddLoadCommand(
 val segmentPath = options.getOrElse(
   "path", throw new UnsupportedOperationException("PATH is manadatory"))
 
+val format = options.getOrElse("format", "carbondata")
+val isCarbonFormat = format.equals("carbondata") || format.equals("carbon")
+
+val allSegments = 
SegmentStatusManager.readLoadMetadata(carbonTable.getMetadataPath)
+
+for (currSegment <- allSegments) {
+  if (currSegment.getPath != null && 
currSegment.getPath.equalsIgnoreCase(segmentPath)) {
+throw new AnalysisException(s"Cannot add the segment. This path is 
already in use by " +
+s"another segment.")
+  }
+}
+
+if (isCarbonFormat && 
SegmentFileStore.getListOfCarbonIndexFiles(segmentPath).isEmpty) {
 
 Review comment:
   Done.
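
   The checks in this hunk can be summarized as a plain Java sketch. 
AddSegmentChecks is a hypothetical class, not the actual CarbonAddLoadCommand; 
lower-casing the format is an assumption made here so that 'PARQUET' and 
'parquet' are treated alike, and IllegalArgumentException stands in for 
Spark's AnalysisException.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the checks in this hunk; not the actual CarbonAddLoadCommand.
public class AddSegmentChecks {

  // Reads the "format" option, defaulting to "carbondata", and normalizes its case (an assumption).
  public static String resolveFormat(Map<String, String> options) {
    return options.getOrDefault("format", "carbondata").toLowerCase(Locale.ROOT);
  }

  // Both "carbondata" and "carbon" denote the native format.
  public static boolean isCarbonFormat(String format) {
    return format.equals("carbondata") || format.equals("carbon");
  }

  // Throws when the new segment path is already used by an existing segment.
  // Existing segments without an external path are represented by null entries.
  public static void checkPathNotInUse(String segmentPath, List<String> existingPaths) {
    for (String existing : existingPaths) {
      if (existing != null && existing.equalsIgnoreCase(segmentPath)) {
        throw new IllegalArgumentException(
            "Cannot add the segment. This path is already in use by another segment.");
      }
    }
  }
}
```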



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341901800
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
 ##
 @@ -184,6 +184,22 @@ public static String writeSegmentFile(CarbonTable 
carbonTable, String segmentId,
 return writeSegmentFile(carbonTable, segmentId, UUID, null, segPath);
   }
 
+  /**
+   * Returns the list of index files
+   *
+   * @param segmentPath
+   * @return
+   */
+  public static CarbonFile[] getListOfCarbonIndexFiles(String segmentPath) {
 
 Review comment:
   Yes, it's the path to the segment folder, used to get the carbon file.



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3427: [CARBONDATA-3562] Fix for SDK filter queries not working when schema is given explicitly while Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3427: [CARBONDATA-3562] 
Fix for SDK filter queries not working when schema is given explicitly while 
Add Segment
URL: https://github.com/apache/carbondata/pull/3427#discussion_r341901576
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
 ##
 @@ -722,12 +723,28 @@ class AddSegmentTestCase extends QueryTest with 
BeforeAndAfterAll {
 val externalSegmentPath = storeLocation + "/" + "external_segment"
 FileFactory.deleteAllFilesOfDir(new File(externalSegmentPath))
 
+var fields: Array[Field] = new Array[Field](14)
 
 Review comment:
   This test is changed only to check the schema that is given externally, 
instead of referring to the schema file of the already existing table.



[GitHub] [carbondata] manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] Fixed issues for Add Segment

2019-11-03 Thread GitBox

manishnalla1994 commented on a change in pull request #3426: [CARBONDATA-3560] 
Fixed issues for Add Segment
URL: https://github.com/apache/carbondata/pull/3426#discussion_r341901364
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
 ##
 @@ -542,7 +541,7 @@ class AddSegmentTestCase extends QueryTest with 
BeforeAndAfterAll {
 copy(path.toString, newPath)
 checkAnswer(sql("select count(*) from addsegment1"), Seq(Row(30)))
 
-sql(s"alter table addsegment1 add segment options('path'='$newPath', 
'format'='parquet')").show()
+sql(s"alter table addsegment1 add segment options('path'='$newPath', 
'format'='PARQUET')").show()
 
 Review comment:
   This change is only for testing the uppercase format, so that I need not 
add a new test case just for case sensitivity.



[GitHub] [carbondata] brijoobopanna commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt to SparkSessionExtension

2019-11-03 Thread GitBox

brijoobopanna commented on issue #3393: [CARBONDATA-3503][WIP][Carbon2] Adapt 
to SparkSessionExtension
URL: https://github.com/apache/carbondata/pull/3393#issuecomment-549216131
 
 
   retest this please
   



[GitHub] [carbondata] brijoobopanna commented on issue #3430: [CARBONDATA-3565] Fix complex binary data broken issue when loading dataframe data

2019-11-03 Thread GitBox

brijoobopanna commented on issue #3430: [CARBONDATA-3565] Fix complex binary 
data broken issue when loading dataframe data
URL: https://github.com/apache/carbondata/pull/3430#issuecomment-549214655
 
 
   add to whitelist
   



[jira] [Resolved] (CARBONDATA-3492) Cache Pre-Priming

2019-11-03 Thread Kunal Kapoor (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3492.
--
Fix Version/s: 2.0.0
   Resolution: Fixed

> Cache Pre-Priming
> -
>
> Key: CARBONDATA-3492
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3492
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: Cache_Pre_Priming_V1.pdf
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Currently, we have an index server which helps in distributed caching of the 
> datamaps in a separate Spark application.
> Caching of the datamaps in the index server starts only when a query is fired 
> on the table for the first time: all the datamaps are loaded if count(*) is 
> fired, and only the required ones are loaded for a filter query.
> The bottleneck is that until a query is fired on the table, no caching is 
> done for the table's datamaps.
> Consider a scenario where we only load data into the table for a whole day 
> and query it the next day: all the segments then start loading into the 
> cache, so the first query is slow.
> What if we load the datamaps into the cache, that is, pre-prime the cache, 
> without waiting for any query on the table?
> What if we load the cache after every load is done, or load the cache for 
> all the segments at once, so that the first query need not do all this work 
> and runs faster?
> I have attached the design document for pre-priming the cache in the index 
> server. Please have a look at it.
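
The difference between lazy caching and pre-priming described in this issue 
can be modeled in a few lines. SegmentCache is a hypothetical model, not the 
index server implementation, and the loader function stands in for whatever 
reads a segment's datamaps.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical model of lazy vs. pre-primed caching; not the index server implementation.
public class SegmentCache {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final Function<String, String> loader;

  public SegmentCache(Function<String, String> loader) {
    this.loader = loader;
  }

  // Lazy path: the first query on a segment pays the loading cost.
  public String get(String segmentId) {
    return cache.computeIfAbsent(segmentId, loader);
  }

  // Pre-priming path: called right after a data load, before any query arrives.
  public void prePrime(String segmentId) {
    cache.computeIfAbsent(segmentId, loader);
  }

  public boolean isCached(String segmentId) {
    return cache.containsKey(segmentId);
  }
}
```

Calling prePrime when a load finishes moves the loading cost out of the first 
query; a later get for the same segment is then served from the cache without 
invoking the loader again.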




[GitHub] [carbondata] asfgit closed pull request #3420: [CARBONDATA-3492] Pre-priming cache

2019-11-03 Thread GitBox

asfgit closed pull request #3420: [CARBONDATA-3492] Pre-priming cache
URL: https://github.com/apache/carbondata/pull/3420
 
 
   



[GitHub] [carbondata] kunal642 commented on issue #3420: [CARBONDATA-3492] Pre-priming cache

2019-11-03 Thread GitBox

kunal642 commented on issue #3420: [CARBONDATA-3492] Pre-priming cache
URL: https://github.com/apache/carbondata/pull/3420#issuecomment-549155903
 
 
   LGTM


