[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5063/



---


[jira] [Issue Comment Deleted] (CARBONDATA-2519) Add document for CarbonReader

2018-05-23 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2519:

Comment: was deleted

(was: Add document for CarbonReader, and change the carbon writer guide 
document)

> Add document for CarbonReader
> -
>
> Key: CARBONDATA-2519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>
> Add document for CarbonReader, and change the carbon writer guide document



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2519) Add document for CarbonReader

2018-05-23 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-2519:

Description: Add document for CarbonReader, and change the carbon writer 
guide document

> Add document for CarbonReader
> -
>
> Key: CARBONDATA-2519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>
> Add document for CarbonReader, and change the carbon writer guide document





[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4902/



---


[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter

2018-05-23 Thread Indhumathi27
Github user Indhumathi27 commented on the issue:

https://github.com/apache/carbondata/pull/2308
  
retest sdv please


---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6062/



---


[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2332
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6066/



---


[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...

2018-05-23 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2252#discussion_r190150092
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v1/CompressedDimensionChunkFileBasedReaderV1.java
 ---
@@ -99,6 +99,7 @@ public CompressedDimensionChunkFileBasedReaderV1(final 
BlockletInfo blockletInfo
 
   @Override public DimensionColumnPage decodeColumnPage(
   DimensionRawColumnChunk dimensionRawColumnChunk, int pageNumber) 
throws IOException {
+boolean isLongStringColumn = 
dimensionRawColumnChunk.isLongStringColumn();
--- End diff --

No, we write data in V3; V1/V2 are supported only for reading, to keep 
backward compatibility.


---


[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4900/



---


[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2308
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4901/



---


[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6059/



---


[GitHub] carbondata issue #2333: [WIP] Change the query flow while selecting the carb...

2018-05-23 Thread rahulforallp
Github user rahulforallp commented on the issue:

https://github.com/apache/carbondata/pull/2333
  
retest this please


---


[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6060/



---


[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2308
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5065/



---


[GitHub] carbondata issue #2320: [Documentation] Editorial review comment fixed

2018-05-23 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2320
  
LGTM


---


[jira] [Created] (CARBONDATA-2519) Add document

2018-05-23 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-2519:
---

 Summary: Add document
 Key: CARBONDATA-2519
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
 Project: CarbonData
  Issue Type: Improvement
Reporter: xubo245








[GitHub] carbondata pull request #2318: [CARBONDATA-2491] Fix the error when reader r...

2018-05-23 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2318#discussion_r190151696
  
--- Diff: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java ---
@@ -177,4 +239,134 @@ public void testWriteAndReadFilesNonTransactional() 
throws IOException, Interrup
 reader.close();
 FileUtils.deleteDirectory(new File(path));
   }
+
+  CarbonProperties carbonProperties;
+
+  @Override
+  public void setUp() {
+carbonProperties = CarbonProperties.getInstance();
+  }
+
+  private static final LogService LOGGER =
+  LogServiceFactory.getLogService(CarbonReaderTest.class.getName());
+
+  @Test
+  public void testTimeStampAndBadRecord() throws IOException, 
InterruptedException {
+String timestampFormat = 
carbonProperties.getProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+String badRecordAction = 
carbonProperties.getProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
+CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION_DEFAULT);
+String badRecordLoc = 
carbonProperties.getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC,
+CarbonCommonConstants.CARBON_BADRECORDS_LOC_DEFAULT_VAL);
+String rootPath = new File(this.getClass().getResource("/").getPath()
++ "../../").getCanonicalPath();
+String storeLocation = rootPath + "/target/";
+carbonProperties
+.addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, 
storeLocation)
+.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, 
"yyyy-MM-dd hh:mm:ss")
+.addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, 
"REDIRECT");
+String path = "./testWriteFiles";
+FileUtils.deleteDirectory(new File(path));
+
+Field[] fields = new Field[9];
+fields[0] = new Field("stringField", DataTypes.STRING);
+fields[1] = new Field("intField", DataTypes.INT);
+fields[2] = new Field("shortField", DataTypes.SHORT);
+fields[3] = new Field("longField", DataTypes.LONG);
+fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+fields[6] = new Field("dateField", DataTypes.DATE);
+fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 
2));
+
+try {
+  CarbonWriterBuilder builder = CarbonWriter.builder()
+  .isTransactionalTable(true)
+  .persistSchemaFile(true)
+  .outputPath(path);
+
+  CarbonWriter writer = builder.buildWriterForCSVInput(new 
Schema(fields));
+
+  for (int i = 0; i < 100; i++) {
+String[] row = new String[]{
+"robot" + (i % 10),
+String.valueOf(i),
+String.valueOf(i),
+String.valueOf(Long.MAX_VALUE - i),
+String.valueOf((double) i / 2),
+String.valueOf(true),
+"2018-05-12",
+"2018-05-12",
+"12.345"
+};
+writer.write(row);
+String[] row2 = new String[]{
+"robot" + (i % 10),
+String.valueOf(i),
+String.valueOf(i),
+String.valueOf(Long.MAX_VALUE - i),
+String.valueOf((double) i / 2),
+String.valueOf(true),
+"2019-03-02",
+"2019-02-12 03:03:34",
+"12.345"
+};
+writer.write(row2);
+  }
+  writer.close();
+} catch (Exception e) {
+  e.printStackTrace();
+  Assert.fail(e.getMessage());
+}
+LOGGER.audit("Bad record location:" + storeLocation);
+File segmentFolder = new File(CarbonTablePath.getSegmentPath(path, 
"null"));
+Assert.assertTrue(segmentFolder.exists());
+
+File[] dataFiles = segmentFolder.listFiles(new FileFilter() {
+  @Override
+  public boolean accept(File pathname) {
+return 
pathname.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT);
+  }
+});
+Assert.assertNotNull(dataFiles);
+Assert.assertTrue(dataFiles.length > 0);
+
+CarbonReader reader = CarbonReader.builder(path, "_temp")
+.projection(new String[]{
+"stringField"
+, "shortField"
+, "intField"
+, "longField"
+, "doubleField"
+, "boolField"
+, "dateField"
+, "timeField"
+, "decimalField"}).build();
+
+int i = 0;
+while (reader.hasNext()) {
--- End diff --
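
A minimal sketch of why the first row written in the test above would likely be treated as a bad record, assuming the intended timestamp format is `yyyy-MM-dd hh:mm:ss`: the value `2018-05-12` has no time part, so it cannot be parsed with that pattern, while `2019-02-12 03:03:34` can. The class and method names are hypothetical, and the check uses only `java.text.SimpleDateFormat`; it is not CarbonData's own parsing code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

final class TimestampFormatCheck {
  // Assumed format; the test configures it through CARBON_TIMESTAMP_FORMAT.
  private static final String ASSUMED_FORMAT = "yyyy-MM-dd hh:mm:ss";

  /** Returns true if the value parses under the assumed timestamp format. */
  static boolean matchesFormat(String value) {
    SimpleDateFormat format = new SimpleDateFormat(ASSUMED_FORMAT);
    format.setLenient(false);
    try {
      format.parse(value);
      return true;
    } catch (ParseException e) {
      // The value does not follow the pattern, e.g. it has no time part.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(matchesFormat("2018-05-12"));
    System.out.println(matchesFormat("2019-02-12 03:03:34"));
  }
}
```

With the bad records action set to REDIRECT, values that fail this kind of check would presumably end up under the configured bad record location rather than in the data files.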
  

[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...

2018-05-23 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2252#discussion_r190161224
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/DimensionRawColumnChunk.java
 ---
@@ -32,13 +32,14 @@
  *  by specifying page number.
  */
 public class DimensionRawColumnChunk extends AbstractRawColumnChunk {
-
   private DimensionColumnPage[] dataChunks;
 
   private DimensionColumnChunkReader chunkReader;
 
   private FileReader fileReader;
+  private boolean isLongStringColumn;
--- End diff --

How are you deciding whether isLongStringColumn is true or not?


---


[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...

2018-05-23 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2252#discussion_r190160809
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java
 ---
@@ -849,7 +852,41 @@ public int getNumberOfDictSortColumns() {
 return this.numberOfSortColumns - this.numberOfNoDictSortColumns;
   }
 
+  public int getNumberOfLongStringColumns() {
+return numberOfLongStringColumns;
+  }
+
   public int getLastDimensionColOrdinal() {
 return lastDimensionColOrdinal;
   }
+
+  @Override public String toString() {
--- End diff --

Why is this method required? Can you add a comment?


---


[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...

2018-05-23 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2252
  
@xuchuanyin how are you deciding whether isLongStringColumn is true or 
false in the query?


---


[jira] [Created] (CARBONDATA-2518) Add a new method for CarbonReader to transform Data and TimeStamp data type

2018-05-23 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-2518:
---

 Summary: Add a new method for CarbonReader to transform Data and 
TimeStamp data type
 Key: CARBONDATA-2518
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2518
 Project: CarbonData
  Issue Type: Improvement
Reporter: xubo245
Assignee: xubo245



org.apache.carbondata.sdk.file.CarbonReader#readNextRow returns an int for Date 
values and a long for Timestamp values, which is inconvenient. We need to add a 
new method to CarbonReader that transforms the Date and Timestamp data types.
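
A sketch of the kind of conversion such a method could perform, assuming the int is the number of days since 1970-01-01 and the long is microseconds since the epoch in UTC. Both encodings are assumptions that this thread does not confirm, and the class and method names are hypothetical; only `java.time` is used.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class CarbonValueFormat {
  private static final DateTimeFormatter TIMESTAMP_FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

  /** Assumption: a DATE value is the number of days since 1970-01-01. */
  static String formatDate(int daysSinceEpoch) {
    return LocalDate.ofEpochDay(daysSinceEpoch).toString();
  }

  /** Assumption: a TIMESTAMP value is microseconds since the epoch, in UTC. */
  static String formatTimestamp(long microsSinceEpoch) {
    // floorDiv/floorMod keep the nanosecond part non-negative for pre-1970 values.
    Instant instant = Instant.ofEpochSecond(
        Math.floorDiv(microsSinceEpoch, 1_000_000L),
        Math.floorMod(microsSinceEpoch, 1_000_000L) * 1_000L);
    return TIMESTAMP_FORMAT.format(instant);
  }

  public static void main(String[] args) {
    System.out.println(formatDate(0));
    System.out.println(formatTimestamp(1_000_000L));
  }
}
```

If the reader actually returns milliseconds, or uses a non-UTC zone, only the divisor and the zone in this sketch would need to change.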






[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5064/



---


[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2308
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6061/



---


[GitHub] carbondata issue #2333: [WIP] Change the query flow while selecting the carb...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2333
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6063/



---


[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2336
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4919/



---


[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2338
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6082/



---


[GitHub] carbondata pull request #2314: [CARBONDATA-2309][DataLoad] Add strategy to g...

2018-05-23 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2314#discussion_r190455538
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
 ---
@@ -575,11 +577,12 @@ public static Dictionary 
getDictionary(AbsoluteTableIdentifier absoluteTableIden
* @param noOfNodesInput -1 if number of nodes has to be decided
*   based on block location information
* @param blockAssignmentStrategy strategy used to assign blocks
+   * @param loadMinSize the property load_min_size_inmb specified by the 
user
* @return a map that maps node to blocks
*/
   public static Map nodeBlockMapping(
   List blockInfos, int noOfNodesInput, List 
activeNodes,
-  BlockAssignmentStrategy blockAssignmentStrategy) {
+  BlockAssignmentStrategy blockAssignmentStrategy, String loadMinSize 
) {
--- End diff --

How about changing the name `loadMinSize` to `expectedMinSizePerNode`?


---


[GitHub] carbondata pull request #2314: [CARBONDATA-2309][DataLoad] Add strategy to g...

2018-05-23 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2314#discussion_r190455207
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
 ---
@@ -596,20 +599,50 @@ public static Dictionary 
getDictionary(AbsoluteTableIdentifier absoluteTableIden
 
 // calculate the average expected size for each node
 long sizePerNode = 0;
+long totalFileSize = 0;
 if (BlockAssignmentStrategy.BLOCK_NUM_FIRST == 
blockAssignmentStrategy) {
   sizePerNode = blockInfos.size() / noofNodes;
   sizePerNode = sizePerNode <= 0 ? 1 : sizePerNode;
-} else if (BlockAssignmentStrategy.BLOCK_SIZE_FIRST == 
blockAssignmentStrategy) {
-  long totalFileSize = 0;
+} else if (BlockAssignmentStrategy.BLOCK_SIZE_FIRST == 
blockAssignmentStrategy
+|| BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST == 
blockAssignmentStrategy) {
   for (Distributable blockInfo : uniqueBlocks) {
 totalFileSize += ((TableBlockInfo) blockInfo).getBlockLength();
   }
   sizePerNode = totalFileSize / noofNodes;
 }
 
-// assign blocks to each node
-assignBlocksByDataLocality(rtnNode2Blocks, sizePerNode, uniqueBlocks, 
originNode2Blocks,
-activeNodes, blockAssignmentStrategy);
+// if enable to control the minimum amount of input data for each node
+if (BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST == 
blockAssignmentStrategy) {
+  long iLoadMinSize = 0;
+  // validate the property load_min_size_inmb specified by the user
+  if (CarbonUtil.validateValidIntType(loadMinSize)) {
+iLoadMinSize = Integer.parseInt(loadMinSize);
+  } else {
+LOGGER.warn("Invalid load_min_size_inmb value found: " + 
loadMinSize
++ ", only int value greater than 0 is supported.");
+iLoadMinSize = CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_DEFAULT;
+  }
+  // If the average expected size for each node greater than load min 
size,
+  // then fall back to default strategy
+  if (iLoadMinSize * 1024 * 1024 < sizePerNode) {
+if 
(CarbonProperties.getInstance().isLoadSkewedDataOptimizationEnabled()) {
+  blockAssignmentStrategy = 
BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
+} else {
+  blockAssignmentStrategy = 
BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+}
+  } else {
--- End diff --

Better to add log
```
LOGGER.info("Specified minimum data size to load is less than the average size "
    + "for each node, falling back to default strategy: " + blockAssignmentStrategy);
```
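
For reference, the fallback decision in the diff above can be sketched in isolation as follows. This is a simplified, hypothetical stand-in (the class `BlockStrategySketch`, its local enum, and the `resolve` method are not part of CarbonData) that mirrors the comparison in the diff and prints where the suggested log line would go.

```java
final class BlockStrategySketch {
  // Local stand-in for the enum of the same name in the diff.
  enum BlockAssignmentStrategy { BLOCK_NUM_FIRST, BLOCK_SIZE_FIRST, NODE_MIN_SIZE_FIRST }

  /**
   * Picks the strategy to use when NODE_MIN_SIZE_FIRST was requested.
   * If the configured minimum (in MB) is below the average size per node,
   * fall back to one of the default strategies, as the diff does.
   */
  static BlockAssignmentStrategy resolve(
      long loadMinSizeInMb, long sizePerNodeBytes, boolean skewedDataOptimizationEnabled) {
    long minSizeBytes = loadMinSizeInMb * 1024L * 1024L;
    if (minSizeBytes < sizePerNodeBytes) {
      BlockAssignmentStrategy fallback = skewedDataOptimizationEnabled
          ? BlockAssignmentStrategy.BLOCK_SIZE_FIRST
          : BlockAssignmentStrategy.BLOCK_NUM_FIRST;
      // This is where the suggested log statement would be emitted.
      System.out.println("Falling back to default strategy: " + fallback);
      return fallback;
    }
    return BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST;
  }

  public static void main(String[] args) {
    // 256 MB minimum against a 1 GB average per node triggers the fallback.
    System.out.println(resolve(256, 1024L * 1024L * 1024L, true));
  }
}
```

Logging the chosen strategy after the fallback is assigned, rather than before, keeps the message consistent with what is actually used.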


---


[GitHub] carbondata issue #2314: [CARBONDATA-2309][DataLoad] Add strategy to generate...

2018-05-23 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2314
  
@ndwangsen please resolve the conflicts and review comments


---


[jira] [Created] (CARBONDATA-2525) Search mode can not register master in security cluster

2018-05-23 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-2525:
---

 Summary: Search mode can not register master in security cluster
 Key: CARBONDATA-2525
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2525
 Project: CarbonData
  Issue Type: Bug
Reporter: xubo245
Assignee: xubo245








[GitHub] carbondata issue #2290: [CARBONDATA-2389] Search mode support lucene datamap

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2290
  
Please help to review it. @jackylk @ravipesala @QiangCai


---


[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2339
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4926/



---


[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2339
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6087/



---


[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2338
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5081/



---


[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2336
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6079/



---


[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2336
  
retest this please


---


[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2337
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6083/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6084/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4921/



---


[GitHub] carbondata pull request #2332: [CARBONDATA-2514] Added condition to check fo...

2018-05-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2332


---


[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2337
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4924/



---


[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2337
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6085/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5082/



---


[GitHub] carbondata pull request #2339: [WIP][CARBONDATA-2525] Search mode can not re...

2018-05-23 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2339

[WIP][CARBONDATA-2525] Search mode can not register master in security 
cluster


Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata 
CARBONDATA-2525-registerMaster

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2339.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2339


commit c9962df700a4dfa3b7e11e6293a7bd851601cda5
Author: xubo245 
Date:   2018-05-24T03:59:14Z

[CARBONDATA-2525] Search mode can not register master in security cluster




---


[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2338
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4922/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
retest this please


---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4925/



---


[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2338
  
retest this please


---


[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2337
  
retest this please


---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6081/



---


[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2338
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5083/



---


[jira] [Resolved] (CARBONDATA-2514) Duplicate columns in CarbonWriter is throwing NullPointerException

2018-05-23 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2514.
--
   Resolution: Fixed
 Assignee: Kunal Kapoor
Fix Version/s: 1.4.1

> Duplicate columns in CarbonWriter is throwing NullPointerException
> --
>
> Key: CARBONDATA-2514
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2514
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2339
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5084/



---


[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2336
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6086/



---


[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2336
  
retest this please


---


[GitHub] carbondata pull request #2327: [WIP] Bloom remove guava cache and use Carbon...

2018-05-23 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2327#discussion_r190448358
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java ---
@@ -101,6 +102,31 @@ public static CacheProvider getInstance() {
 return cacheTypeToCacheMap.get(cacheType);
   }
 
+  /**
+   * This method will check if a cache already exists for given cache type 
and store
+   * if it is not present in the map
+   */
+  public  Cache createCache(CacheType cacheType, String 
cacheClassName)
--- End diff --

Then it will be a disadvantage compared to guava-cache. Besides, I think 
it's quite useful to be able to configure the cache size at a fine granularity.


---


[jira] [Resolved] (CARBONDATA-2221) Drop table should throw exception when metastore operation failed

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2221.
-
Resolution: Fixed

> Drop table should throw exception when metastore operation failed
> -
>
> Key: CARBONDATA-2221
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2221
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jacky Li
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-2188) Support DataMap developer interface

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2188:

Fix Version/s: (was: 1.4.0)

> Support DataMap developer interface
> ---
>
> Key: CARBONDATA-2188
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2188
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Jacky Li
>Priority: Major
>
> Developers should be able to add a new DataMap implementation; a developer 
> interface should be added for this.





[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2331
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5068/



---


[jira] [Created] (CARBONDATA-2521) Support create carbonReader without tableName

2018-05-23 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-2521:
---

 Summary: Support create carbonReader without tableName
 Key: CARBONDATA-2521
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2521
 Project: CarbonData
  Issue Type: Improvement
Reporter: xubo245
Assignee: xubo245


Support create carbonReader without tableName





[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4909/



---


[GitHub] carbondata issue #2321: [CARBONDATA-2520] Clean and close datamap writers on...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2321
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6070/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
@jackylk @ravipesala Hello, sounakr gave an LGTM and CI passed. Could you please 
check and merge it if there are no problems?


---


[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2308
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6067/



---


[jira] [Updated] (CARBONDATA-1332) Dictionary generation time in spark 2.1 is more than spark 1.5

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1332:

Fix Version/s: (was: 1.4.0)

> Dictionary generation time in spark 2.1 is more than spark 1.5
> --
>
> Key: CARBONDATA-1332
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1332
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.1.1
>Reporter: Venkata Ramana G
>Priority: Major
>






[jira] [Updated] (CARBONDATA-1141) Data load is partially successful but delete error

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1141:

Fix Version/s: (was: 1.4.0)

> Data load is partially successful  but delete error
> ---
>
> Key: CARBONDATA-1141
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1141
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration, sql
>Affects Versions: 1.2.0
> Environment: spark on 
> yarn,carbondata1.2.0,hadoop2.7,spark2.1.0,hive2.1.0
>Reporter: zhuzhibin
>Priority: Major
> Attachments: error.png, error1.png
>
>
> When I tried to load data into a table (data size is about 300 million), the 
> log showed "Data load is partially successful for table".
> When I then executed a delete table operation, an error appeared with the 
> message "java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.carbondata.core.mutate.CarbonUpdateUtil.getRequiredFieldFromTID(CarbonUpdateUtil.java:67)".
> When I executed another delete table operation with a where condition, it 
> succeeded, but a subsequent select operation failed with 
> "java.lang.ArrayIndexOutOfBoundsException Driver stacktrace:
>   at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)"
>  





[jira] [Resolved] (CARBONDATA-2276) Support SDK API to read schema in data file and schema file

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2276.
-
Resolution: Fixed

> Support SDK API to read schema in data file and schema file
> ---
>
> Key: CARBONDATA-2276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2276
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6069/



---


[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2331
  
retest this please


---


[GitHub] carbondata issue #2335: [WIP] integrate carbonstore mv branch

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4910/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/5066/



---


[jira] [Created] (CARBONDATA-2520) datamap writers are not getting closed on task failure

2018-05-23 Thread Akash R Nilugal (JIRA)
Akash R Nilugal created CARBONDATA-2520:
---

 Summary: datamap writers are not getting closed on task failure
 Key: CARBONDATA-2520
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2520
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


*Problem:* The datamap writers registered to the listener are closed or 
finished only when the load succeeds, not in any failure case. While testing 
Lucene, it was found that after a task fails the writer is not closed, so the 
write.lock file written in the Lucene index folder still exists. When the next 
task tries to write the index in the same directory, it fails with a "lock 
file already exists" error.
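
The failure mode described above can be illustrated with a minimal, self-contained Java sketch. Everything in it is an assumption made for illustration: `IndexWriter` and the `LOCKS` set are hypothetical stand-ins, not CarbonData's datamap writer or Lucene's writer and write.lock file. It only shows that closing the writer on every exit path lets a retry take the lock again.

```java
import java.util.HashSet;
import java.util.Set;

public class WriterCleanupSketch {
    // Hypothetical in-memory stand-in for write.lock files on disk.
    static final Set<String> LOCKS = new HashSet<>();

    // Hypothetical writer: takes a per-directory lock on open, releases it on close.
    static class IndexWriter implements AutoCloseable {
        private final String dir;

        IndexWriter(String dir) {
            if (!LOCKS.add(dir)) {
                throw new IllegalStateException("lock file already exists: " + dir);
            }
            this.dir = dir;
        }

        @Override
        public void close() {
            LOCKS.remove(dir);
        }
    }

    // Runs one task. try-with-resources closes the writer whether or not the body throws.
    static boolean runTask(String dir, boolean fail) {
        try (IndexWriter writer = new IndexWriter(dir)) {
            if (fail) {
                throw new RuntimeException("simulated task failure");
            }
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runTask("index_dir", true));   // first attempt fails
        System.out.println(runTask("index_dir", false));  // retry can still take the lock
    }
}
```

If the writer were closed only on the success path, the second call would hit the "lock file already exists" branch instead of succeeding.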





[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2332
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6068/



---


[jira] [Updated] (CARBONDATA-2153) Failed to update table status for pre-aggregate table when maintain insert twice and auto merge open

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2153:

Fix Version/s: (was: 1.4.0)

> Failed to update table status for pre-aggregate table when maintain insert 
> twice  and auto merge open
> -
>
> Key: CARBONDATA-2153
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2153
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, spark-integration
>Affects Versions: 1.3.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Failed to update table status for pre-aggregate table when maintain insert 
> twice





[jira] [Updated] (CARBONDATA-2516) Filter Greater-than for timestamp datatype not generating Expression in PrestoFilterUtil

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2516:

Fix Version/s: (was: 1.4.0)

> Filter Greater-than for timestamp datatype not generating Expression in 
> PrestoFilterUtil
> 
>
> Key: CARBONDATA-2516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2516
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 1.4.0
>Reporter: Sourabh Verma
>Priority: Major
>
> Scenario - carbon-data Table 'load_table' with columns 'integer', 'datetime'
> //table creation and load code (spark)
> val random = new Random()
> val df = spark.sparkContext.parallelize(1 to (365 * 24 * 360))
> .map(x => (random.nextInt(200), new Timestamp(currentMillis - (x * 1000L))))
> .toDF("integer", "datetime")
> // Saves dataframe to carbondata file
> df.write.format("carbondata")
> .option("tableName", "load_table")
> .option("compress", "true")
> .option("tempCSV", "false")
> .mode(SaveMode.Overwrite)
> .save()
> SQL (through Presto CLI) - select * from load_table where datetime > 
> date_parse('2018-05-10 18:22:15', '%Y-%m-%d %T');
> Issue - CarbonData performs a full scan over the files even though a 
> greater-than filter expression on the timestamp column was passed.
> Cause - PrestoFilterUtil does not create a greater-than Expression for 
> timestamp.
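
To show why a missing greater-than expression leads to a full scan, here is a small Java sketch of min/max pruning. It is an illustration under assumptions, not PrestoFilterUtil or CarbonData's pruning code: `BlockStats` and `mayContainGreaterThan` are hypothetical names. A reader can skip a block only when the predicate and its bound actually reach it.

```java
import java.sql.Timestamp;

public class GreaterThanPruningSketch {
    // Hypothetical per-block statistics: min and max of the timestamp column, in millis.
    static class BlockStats {
        final long minMillis;
        final long maxMillis;

        BlockStats(long minMillis, long maxMillis) {
            this.minMillis = minMillis;
            this.maxMillis = maxMillis;
        }
    }

    // For "col > bound", a block can be skipped when its max value is <= bound.
    static boolean mayContainGreaterThan(BlockStats stats, Timestamp bound) {
        return stats.maxMillis > bound.getTime();
    }

    public static void main(String[] args) {
        Timestamp bound = Timestamp.valueOf("2018-05-10 18:22:15");
        BlockStats older = new BlockStats(bound.getTime() - 2000, bound.getTime() - 1000);
        BlockStats newer = new BlockStats(bound.getTime() - 1000, bound.getTime() + 1000);
        System.out.println(mayContainGreaterThan(older, bound));  // block can be skipped
        System.out.println(mayContainGreaterThan(newer, bound));  // block must be read
    }
}
```

Without the pushed-down expression there is no bound to compare against, so every block has to be read.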





[jira] [Updated] (CARBONDATA-2072) Add dropTables method for optimizing drop table operation in test cases

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2072:

Fix Version/s: (was: 1.4.0)

> Add dropTables method for optimizing drop table operation in test cases
> ---
>
> Key: CARBONDATA-2072
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2072
> Project: CarbonData
>  Issue Type: Test
>  Components: test
>Affects Versions: 1.3.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> There are many drop table statements in beforeAll or afterAll of test cases, like this:
> {code:java}
>  override def afterAll {
> sql("drop table if exists load")
> sql("drop table if exists inser")
> sql("DROP TABLE IF EXISTS THive")
> sql("DROP TABLE IF EXISTS TCarbon")
> sql("drop table if exists TCarbonLocal")
> sql("drop table if exists TCarbonSource")
> sql("drop table if exists loadtable")
> sql("drop table if exists insertTable")
> sql("drop table if exists CarbonDest")
> sql("drop table if exists HiveDest")
> sql("drop table if exists CarbonOverwrite")
> sql("drop table if exists HiveOverwrite")
> sql("drop table if exists tcarbonsourceoverwrite")
> sql("drop table if exists carbon_table1")
> sql("drop table if exists carbon_table")
> sql("DROP TABLE IF EXISTS student")
> sql("DROP TABLE IF EXISTS uniqdata")
> sql("DROP TABLE IF EXISTS show_insert")
> sql("drop table if exists OverwriteTable_t1")
> sql("drop table if exists OverwriteTable_t2")
> }
> {code}
> in 
> org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase
> This can be optimized with a public method in QueryTest
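
A helper of the kind proposed above might look roughly like the following Java sketch. The name `dropTables` comes from the issue title, but the signature and the injected `sql` executor are assumptions for illustration. The actual method added to QueryTest may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class DropTablesSketch {
    // Issues one "DROP TABLE IF EXISTS" statement per table name.
    // The sql executor is injected; in a real suite it would be the test's sql(...) method.
    static void dropTables(Consumer<String> sql, String... tableNames) {
        for (String name : tableNames) {
            sql.accept("DROP TABLE IF EXISTS " + name);
        }
    }

    public static void main(String[] args) {
        // Collect the statements instead of running them, so the sketch is self-contained.
        List<String> executed = new ArrayList<>();
        dropTables(executed::add, "load", "inser", "THive", "TCarbon");
        executed.forEach(System.out::println);
    }
}
```

With such a helper, the twenty `sql("drop table if exists ...")` lines in afterAll collapse into a single call with a list of table names.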





[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2331
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6065/



---


[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...

2018-05-23 Thread sounakr
Github user sounakr commented on the issue:

https://github.com/apache/carbondata/pull/2318
  
LGTM


---


[jira] [Updated] (CARBONDATA-1014) Refactor on data loading and encoding override

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1014:

Fix Version/s: (was: 1.4.0)

> Refactor on data loading and encoding override
> --
>
> Key: CARBONDATA-1014
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1014
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
>
> Refactor on current data loading flow to make it:
> 1. Use vectorized processing as early as possible
> 2. Make index build (sorting) CPU cache efficient, by using rowId and key 
> column vector to sort
> 3. Open interface for format extension, including column encoding, 
> compression, statistics.
> Design doc will be posted in this JIRA soon.





[jira] [Updated] (CARBONDATA-2154) Carbon should support match pre-aggregate table when SET carbon.input.segments.default.carbon_table=*

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2154:

Fix Version/s: (was: 1.4.0)

> Carbon should support match pre-aggregate table when SET 
> carbon.input.segments.default.carbon_table=*
> -
>
> Key: CARBONDATA-2154
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2154
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 1.3.0
>Reporter: xubo245
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Carbon should support match pre-aggregate table when SET 
> carbon.input.segments.default.carbon_table=*





[jira] [Resolved] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2337.
-
Resolution: Fixed

> Fix duplicately acquiring 'streaming.lock' error when integrating with 
> spark-streaming
> --
>
> Key: CARBONDATA-2337
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2337
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.4.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.4.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> After [PR2135|https://github.com/apache/carbondata/pull/2135] was merged, 
> 'streaming.lock' is acquired twice when integrating with spark-streaming.





[jira] [Updated] (CARBONDATA-2233) Improve test cases of DBLOcationCarbonTableTestCase

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2233:

Fix Version/s: (was: 1.4.0)

> Improve test cases of   DBLOcationCarbonTableTestCase
> -
>
> Key: CARBONDATA-2233
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2233
> Project: CarbonData
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 1.4.0
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-2169) Conflicting classes cause NoSuchMethodError, when our project using org.apache.carbondata:carbondata-hive:1.3.0

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2169:

Fix Version/s: (was: 1.4.0)

> Conflicting classes cause NoSuchMethodError, when our project using 
> org.apache.carbondata:carbondata-hive:1.3.0
> ---
>
> Key: CARBONDATA-2169
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2169
> Project: CarbonData
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 1.3.0
>Reporter: PandaMonkey
>Priority: Minor
> Attachments: carbondata conflicts.txt
>
>
> Hi, when using org.apache.carbondata:carbondata-hive:1.3.0, we got a 
> *NoSuchMethodError*. By analyzing the source code, we found that the root 
> cause is conflicting classes in different JARs: duplicate classes exist in 
> different JARs but have different features, so the classes actually loaded 
> are not the ones our project requires. (The JVM only loads the class that 
> appears first on the classpath and shadows other duplicates with the same 
> name.) Such conflicts exist in several JAR pairs that carbondata-hive:1.3.0 
> depends on. The detailed conflict information is listed in the attachment.
> Conflicting Jar-pairs:
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  
> jar-pair:
>  jar-pair:
>  jar-pair:
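
When diagnosing this kind of shadowing, it can help to ask the JVM where a class was actually loaded from. The sketch below uses only standard JDK calls (`Class.getProtectionDomain()` and `CodeSource.getLocation()`). The class name `ClassOriginSketch` and the helper `originOf` are made up for illustration and are not part of CarbonData.

```java
import java.security.CodeSource;

public class ClassOriginSketch {
    // Returns the location (JAR or directory) a class was loaded from.
    // JDK core classes have no code source, so a placeholder is returned for them.
    static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        System.out.println(originOf(String.class));
        System.out.println(originOf(ClassOriginSketch.class));
    }
}
```

Calling `originOf` on the class named in the NoSuchMethodError shows which of the two conflicting JARs won on the classpath.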





[jira] [Updated] (CARBONDATA-2000) Unable to save a dataframe result as carbondata streaming table

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2000:

Fix Version/s: (was: 1.4.0)

> Unable to save a dataframe result as carbondata streaming table
> ---
>
> Key: CARBONDATA-2000
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2000
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.3.0
> Environment: spark-2.1
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
>
> 1. Create a CarbonSession
>  import org.apache.spark.sql.{SaveMode, SparkSession}
> import org.apache.spark.sql.CarbonSession._
> val carbon = SparkSession.builder().config(sc.getConf) 
> .getOrCreateCarbonSession("hdfs://localhost:54311/newCarbonStore","/tmp")
> 2. Create a dataframe with the CarbonSession
> import carbon.sqlContext.implicits._
> carbon.sql("drop table if exists streamingtable"); 
> val df =carbon.sparkContext.parallelize(1 to 5).toDF("colId")
> 3. Register the dataframe as a carbon streaming table
>  
> df.write.format("carbondata").option("tableName","streamingTable").option("streaming","true").mode(SaveMode.Overwrite).save
> 4. Describe the formatted table
> carbon.sql("describe formatted streamingTable").show(100)
> +--------------------+--------------------+--------------------+
> |            col_name|           data_type|             comment|
> +--------------------+--------------------+--------------------+
> |colid            ...|int              ...|MEASURE,null     ...|
> |                 ...|                 ...|                 ...|
> |##Detailed Table ...|                 ...|                 ...|
> |Database Name    ...|default          ...|                 ...|
> |Table Name       ...|streamingtable   ...|                 ...|
> |CARBON Store Path...|hdfs://localhost:...|                 ...|
> |Comment          ...|                 ...|                 ...|
> |Table Block Size ...|1024 MB          ...|                 ...|
> |Table Data Size  ...|316              ...|                 ...|
> |Table Index Size ...|283              ...|                 ...|
> |Last Update Time ...|1515393447642    ...|                 ...|
> |SORT_SCOPE       ...|LOCAL_SORT       ...|LOCAL_SORT       ...|
> |Streaming        ...|false            ...|                 ...|
> |SORT_SCOPE       ...|LOCAL_SORT       ...|LOCAL_SORT       ...|
> |                 ...|                 ...|                 ...|
> |##Detailed Column...|                 ...|                 ...|
> |ADAPTIVE         ...|                 ...|                 ...|
> |SORT_COLUMNS     ...|                 ...|                 ...|
> +--------------------+--------------------+--------------------+





[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2332
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4908/



---


[GitHub] carbondata issue #2335: [WIP] integrate carbonstore mv branch

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2335
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6071/



---


[GitHub] carbondata pull request #2336: [CARBONDATA-2521] Support create carbonReader...

2018-05-23 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2336

[CARBONDATA-2521] Support create carbonReader without tableName

Add new method for creating carbonReader without tableName
Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 NA
 - [ ] Document update required?
No
 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   Add some test case
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
No

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata 
CARBONDATA-2521-supportReaderWithoutTableName

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2336


commit 80b4c9a724513d79439cbfcba62605648a4b5450
Author: xubo245 
Date:   2018-05-23T13:08:23Z

[CARBONDATA-2521] Support create carbonReader without tableName




---


[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...

2018-05-23 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2332
  
retest this please


---


[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2331
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4905/



---


[GitHub] carbondata issue #2314: [CARBONDATA-2309][DataLoad] Add strategy to generate...

2018-05-23 Thread ndwangsen
Github user ndwangsen commented on the issue:

https://github.com/apache/carbondata/pull/2314
  
@kumarvishal09  If the minimum data load per node, whether specified by the 
user or left at the default, is less than the average data amount per node, 
the existing strategy is used to handle it.


---


[jira] [Updated] (CARBONDATA-2099) Refactor on query scan process to improve readability

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2099:

Fix Version/s: (was: 1.4.0)

> Refactor on query scan process to improve readability
> -
>
> Key: CARBONDATA-2099
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2099
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-1600) Carbon store abstraction

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1600:

Fix Version/s: (was: 1.4.0)
   NONE

> Carbon store abstraction
> 
>
> Key: CARBONDATA-1600
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1600
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: NONE
>
>
> There should be a carbondata-store module to abstract all functionality 
> that sits above the file format, and to provide a developer API that lets 
> different compute engines integrate with carbon





[jira] [Updated] (CARBONDATA-1871) Add annotation for interface compatibility

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1871:

Fix Version/s: (was: 1.4.0)

> Add annotation for interface compatibility
> --
>
> Key: CARBONDATA-1871
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1871
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
>
> All user-facing APIs should be annotated with a proper stability level. 
> InterfaceStability levels include: 
> 1. Forever: API at this level is compatible across major versions
> 2. Stable: API at this level is compatible across minor versions, but may 
> break across major versions
> 3. Evolving: API at this level is compatible across maintenance versions, 
> but may break across minor versions
> 4. Unstable: API at this level has no backward compatibility guarantee
> Since users mainly use SQL for carbondata, the APIs that need to be 
> annotated include:
> 1. Table Property in create table
> 2. Load Option in load data and dataframe api
> 3. Carbon Property
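
The four levels listed above could be expressed as marker annotations, roughly as in the Java sketch below. This is an assumption about shape only. The annotation names mirror the levels in the issue, while `ExampleLoadOptions` is a hypothetical class, and the annotations CarbonData ships may be declared differently.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class InterfaceStabilitySketch {
    // Compatible across major versions.
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Forever {}

    // Compatible across minor versions; may break across major versions.
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Stable {}

    // Compatible across maintenance versions; may break across minor versions.
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Evolving {}

    // No backward compatibility guarantee.
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Unstable {}

    // Hypothetical user-facing class tagged with its stability level.
    @Evolving
    static class ExampleLoadOptions {}

    public static void main(String[] args) {
        System.out.println(ExampleLoadOptions.class.isAnnotationPresent(Evolving.class));
        System.out.println(ExampleLoadOptions.class.isAnnotationPresent(Stable.class));
    }
}
```

Runtime retention lets tooling or tests check that every public API class carries one of the levels. Table properties and load options are string keys rather than classes, so they would need a different mechanism, such as annotating the constants that define them.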





[jira] [Resolved] (CARBONDATA-2159) Remove carbon-spark dependency for sdk module

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2159.
-
Resolution: Fixed

> Remove carbon-spark dependency for sdk module
> -
>
> Key: CARBONDATA-2159
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2159
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> store-sdk module should not depend on carbon-spark module





[jira] [Updated] (CARBONDATA-2162) Remove spark dependency in carbon-core and carbon-processing module

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2162:

Fix Version/s: (was: 1.4.0)

> Remove spark dependency in carbon-core and carbon-processing module
> ---
>
> Key: CARBONDATA-2162
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2162
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
>
> The assembly JAR of the store-sdk module should be small, but currently it 
> includes the spark JAR because the carbon-core, carbon-processing, and 
> carbon-hadoop modules depend on spark.
> This dependency should be removed





[jira] [Updated] (CARBONDATA-1825) Carbon 1.3.0 - Spark 2.2- Data load fails on 20k columns carbon table with CarbonDataWriterException

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1825:

Fix Version/s: (was: 1.4.0)

> Carbon 1.3.0 - Spark 2.2- Data load fails on 20k columns carbon table with 
> CarbonDataWriterException
> 
>
> Key: CARBONDATA-1825
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1825
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>Priority: Minor
>  Labels: DFX
>
> Steps:
> Beeline:
> 1. Create carbon table with 20k columns
> 2. Run table load
> *+Expected:+* Table load should succeed
> *+Actual:+*  Table load fails





[jira] [Updated] (CARBONDATA-2515) Filter OR Expression not working properly in Presto integration

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2515:

Fix Version/s: (was: 1.4.0)
   1.4.1

> Filter OR Expression not working properly in Presto integration
> ---
>
> Key: CARBONDATA-2515
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2515
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 1.4.0
> Environment: Spark 2.1, Presto 0.187
>Reporter: Sourabh Verma
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Scenario - carbon-data Table 'load_table' with columns 'integer', 'datetime'
> //table creation and load code (spark)
>  val random = new Random()
>  val df = spark.sparkContext.parallelize(1 to (365 * 24 * 360))
>  .map(x => (random.nextInt(200), new Timestamp(currentMillis - (x * 1000L))))
>  .toDF("integer", "datetime")
> // Saves dataframe to carbondata file
>  df.write.format("carbondata")
>  .option("tableName", "load_table")
>  .option("compress", "true")
>  .option("tempCSV", "false")
>  .mode(SaveMode.Overwrite)
>  .save()
> SQL (through Presto CLI) - select * from load_table where integer < 10 or 
> integer > 50;
> Actual result - 0 rows.
>  Expected result - rows with integer value less than 10 or greater than 50.
> Cause - PrestoFilterUtil is creating AND Expressions.
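
The reported symptom (0 rows) is what one would expect if the two range conditions are combined with AND instead of OR. The Java sketch below demonstrates this on the value range 0 to 199 that `random.nextInt(200)` produces. It uses plain `IntPredicate` composition as a stand-in and does not reproduce PrestoFilterUtil's Expression classes.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

public class OrFilterSketch {
    static final IntPredicate LESS_THAN_10 = v -> v < 10;
    static final IntPredicate GREATER_THAN_50 = v -> v > 50;

    // Counts how many values in [0, 200) satisfy the given predicate.
    static long countMatching(IntPredicate predicate) {
        return IntStream.range(0, 200).filter(predicate).count();
    }

    public static void main(String[] args) {
        // integer < 10 OR integer > 50: values 0..9 and 51..199
        System.out.println(countMatching(LESS_THAN_10.or(GREATER_THAN_50)));   // 159
        // integer < 10 AND integer > 50: no value satisfies both
        System.out.println(countMatching(LESS_THAN_10.and(GREATER_THAN_50)));  // 0
    }
}
```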





[jira] [Updated] (CARBONDATA-1603) support user specified segments in major compaction

2018-05-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-1603:

Fix Version/s: (was: 1.4.0)

> support user specified segments in major compaction
> ---
>
> Key: CARBONDATA-1603
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1603
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.3.0
>Reporter: Zhoujin
>Priority: Minor
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> support user specified segments in major compaction
> the proposed syntax: 
> ALTER TABLE [db_name].table_name COMPACT [SEGMENT seg_id1,seg_id2] 'MAJOR' 
> in which [SEGMENT seg_id1,seg_id2] is optional and compatible with original 
> syntax. 





[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader

2018-05-23 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2337
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6078/



---


[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...

2018-05-23 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2331
  
@sraghunandan @rahulforallp This PR fixes a possible problem in #2274. Please 
review it.


---

