[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2321 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5063/ ---
[jira] [Issue Comment Deleted] (CARBONDATA-2519) Add document for CarbonReader
[ https://issues.apache.org/jira/browse/CARBONDATA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated CARBONDATA-2519:
Comment: was deleted (was: Add document for CarbonReader, and change the carbon writer guide document)

> Add document for CarbonReader
> -----------------------------
>
> Key: CARBONDATA-2519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
> Project: CarbonData
> Issue Type: Improvement
> Reporter: xubo245
> Assignee: xubo245
> Priority: Major
>
> Add document for CarbonReader, and change the carbon writer guide document

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2519) Add document for CarbonReader
[ https://issues.apache.org/jira/browse/CARBONDATA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated CARBONDATA-2519:
Description: Add document for CarbonReader, and change the carbon writer guide document

> Add document for CarbonReader
> -----------------------------
>
> Key: CARBONDATA-2519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
> Project: CarbonData
> Issue Type: Improvement
> Reporter: xubo245
> Assignee: xubo245
> Priority: Major
>
> Add document for CarbonReader, and change the carbon writer guide document
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4902/ ---
[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter
Github user Indhumathi27 commented on the issue: https://github.com/apache/carbondata/pull/2308 retest sdv please ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6062/ ---
[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2332 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6066/ ---
[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2252#discussion_r190150092

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/reader/dimension/v1/CompressedDimensionChunkFileBasedReaderV1.java ---
@@ -99,6 +99,7 @@ public CompressedDimensionChunkFileBasedReaderV1(final BlockletInfo blockletInfo
 @Override public DimensionColumnPage decodeColumnPage(
     DimensionRawColumnChunk dimensionRawColumnChunk, int pageNumber) throws IOException {
+  boolean isLongStringColumn = dimensionRawColumnChunk.isLongStringColumn();
--- End diff --

No, we support only V3 while writing the data; V1/V2 are supported only for reading, to maintain backward compatibility. ---
[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2321 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4900/ ---
[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2308 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4901/ ---
[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2321 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6059/ ---
[GitHub] carbondata issue #2333: [WIP] Change the query flow while selecting the carb...
Github user rahulforallp commented on the issue: https://github.com/apache/carbondata/pull/2333 retest this please ---
[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2321 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6060/ ---
[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2308 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5065/ ---
[GitHub] carbondata issue #2320: [Documentation] Editorial review comment fixed
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/2320 LGTM ---
[jira] [Created] (CARBONDATA-2519) Add document
xubo245 created CARBONDATA-2519:
-----------------------------------
Summary: Add document
Key: CARBONDATA-2519
URL: https://issues.apache.org/jira/browse/CARBONDATA-2519
Project: CarbonData
Issue Type: Improvement
Reporter: xubo245
[GitHub] carbondata pull request #2318: [CARBONDATA-2491] Fix the error when reader r...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2318#discussion_r190151696

--- Diff: store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java ---
@@ -177,4 +239,134 @@ public void testWriteAndReadFilesNonTransactional() throws IOException, Interrup
     reader.close();
     FileUtils.deleteDirectory(new File(path));
   }
+
+  CarbonProperties carbonProperties;
+
+  @Override
+  public void setUp() {
+    carbonProperties = CarbonProperties.getInstance();
+  }
+
+  private static final LogService LOGGER =
+      LogServiceFactory.getLogService(CarbonReaderTest.class.getName());
+
+  @Test
+  public void testTimeStampAndBadRecord() throws IOException, InterruptedException {
+    String timestampFormat = carbonProperties.getProperty(
+        CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+        CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+    String badRecordAction = carbonProperties.getProperty(
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION_DEFAULT);
+    String badRecordLoc = carbonProperties.getProperty(
+        CarbonCommonConstants.CARBON_BADRECORDS_LOC,
+        CarbonCommonConstants.CARBON_BADRECORDS_LOC_DEFAULT_VAL);
+    String rootPath = new File(this.getClass().getResource("/").getPath()
+        + "../../").getCanonicalPath();
+    String storeLocation = rootPath + "/target/";
+    carbonProperties
+        .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, storeLocation)
+        .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "-MM-dd hh:mm:ss")
+        .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "REDIRECT");
+    String path = "./testWriteFiles";
+    FileUtils.deleteDirectory(new File(path));
+
+    Field[] fields = new Field[9];
+    fields[0] = new Field("stringField", DataTypes.STRING);
+    fields[1] = new Field("intField", DataTypes.INT);
+    fields[2] = new Field("shortField", DataTypes.SHORT);
+    fields[3] = new Field("longField", DataTypes.LONG);
+    fields[4] = new Field("doubleField", DataTypes.DOUBLE);
+    fields[5] = new Field("boolField", DataTypes.BOOLEAN);
+    fields[6] = new Field("dateField", DataTypes.DATE);
+    fields[7] = new Field("timeField", DataTypes.TIMESTAMP);
+    fields[8] = new Field("decimalField", DataTypes.createDecimalType(8, 2));
+
+    try {
+      CarbonWriterBuilder builder = CarbonWriter.builder()
+          .isTransactionalTable(true)
+          .persistSchemaFile(true)
+          .outputPath(path);
+
+      CarbonWriter writer = builder.buildWriterForCSVInput(new Schema(fields));
+
+      for (int i = 0; i < 100; i++) {
+        String[] row = new String[]{
+            "robot" + (i % 10),
+            String.valueOf(i),
+            String.valueOf(i),
+            String.valueOf(Long.MAX_VALUE - i),
+            String.valueOf((double) i / 2),
+            String.valueOf(true),
+            "2018-05-12",
+            "2018-05-12",
+            "12.345"
+        };
+        writer.write(row);
+        String[] row2 = new String[]{
+            "robot" + (i % 10),
+            String.valueOf(i),
+            String.valueOf(i),
+            String.valueOf(Long.MAX_VALUE - i),
+            String.valueOf((double) i / 2),
+            String.valueOf(true),
+            "2019-03-02",
+            "2019-02-12 03:03:34",
+            "12.345"
+        };
+        writer.write(row2);
+      }
+      writer.close();
+    } catch (Exception e) {
+      e.printStackTrace();
+      Assert.fail(e.getMessage());
+    }
+    LOGGER.audit("Bad record location:" + storeLocation);
+    File segmentFolder = new File(CarbonTablePath.getSegmentPath(path, "null"));
+    Assert.assertTrue(segmentFolder.exists());
+
+    File[] dataFiles = segmentFolder.listFiles(new FileFilter() {
+      @Override
+      public boolean accept(File pathname) {
+        return pathname.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT);
+      }
+    });
+    Assert.assertNotNull(dataFiles);
+    Assert.assertTrue(dataFiles.length > 0);
+
+    CarbonReader reader = CarbonReader.builder(path, "_temp")
+        .projection(new String[]{
+            "stringField"
+            , "shortField"
+            , "intField"
+            , "longField"
+            , "doubleField"
+            , "boolField"
+            , "dateField"
+            , "timeField"
+            , "decimalField"}).build();
+
+    int i = 0;
+    while (reader.hasNext()) {
--- End diff --
[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2252#discussion_r190161224

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/DimensionRawColumnChunk.java ---
@@ -32,13 +32,14 @@ * by specifying page number. */
 public class DimensionRawColumnChunk extends AbstractRawColumnChunk {
   private DimensionColumnPage[] dataChunks;
   private DimensionColumnChunkReader chunkReader;
   private FileReader fileReader;
+  private boolean isLongStringColumn;
--- End diff --

How are you deciding whether it is a long string column or not? ---
[GitHub] carbondata pull request #2252: [CARBONDATA-2420] Support string longer than ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2252#discussion_r190160809

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentProperties.java ---
@@ -849,7 +852,41 @@ public int getNumberOfDictSortColumns() { return this.numberOfSortColumns - this.numberOfNoDictSortColumns; }
+  public int getNumberOfLongStringColumns() {
+    return numberOfLongStringColumns;
+  }
+
   public int getLastDimensionColOrdinal() {
     return lastDimensionColOrdinal;
   }
+
+  @Override public String toString() {
--- End diff --

Why is this method required? Can you add some comments? ---
[GitHub] carbondata issue #2252: [CARBONDATA-2420] Support string longer than 32000 c...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2252 @xuchuanyin how are you deciding whether isLongStringColumn is true or false during a query? ---
[jira] [Created] (CARBONDATA-2518) Add a new method for CarbonReader to transform Date and TimeStamp data types
xubo245 created CARBONDATA-2518:
-----------------------------------
Summary: Add a new method for CarbonReader to transform Date and TimeStamp data types
Key: CARBONDATA-2518
URL: https://issues.apache.org/jira/browse/CARBONDATA-2518
Project: CarbonData
Issue Type: Improvement
Reporter: xubo245
Assignee: xubo245

org.apache.carbondata.sdk.file.CarbonReader#readNextRow returns an int for Date and a long value for Timestamp, which is inconvenient. We need to add a new method for CarbonReader to transform the Date and TimeStamp data types.
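As context for CARBONDATA-2518 above: a minimal, hypothetical sketch of the kind of conversion such a method could perform, assuming the int value is days since the Unix epoch and the long value is epoch milliseconds (an assumption for illustration only; this class is not part of CarbonData's actual SDK API):

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class CarbonValueDecoder {
    // Assumption: a DATE column value is encoded as days since 1970-01-01.
    static LocalDate decodeDate(int epochDays) {
        return LocalDate.ofEpochDay(epochDays);
    }

    // Assumption: a TIMESTAMP column value is encoded as milliseconds since the epoch (UTC).
    static LocalDateTime decodeTimestamp(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(decodeDate(0));       // 1970-01-01
        System.out.println(decodeTimestamp(0L)); // 1970-01-01T00:00
    }
}
```

A reader-side helper like this would let callers work with `java.time` objects instead of raw encoded primitives.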
[GitHub] carbondata issue #2321: [WIP]clean and close datamap writers on any task fai...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2321 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5064/ ---
[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2308 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6061/ ---
[GitHub] carbondata issue #2333: [WIP] Change the query flow while selecting the carb...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2333 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6063/ ---
[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2336 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4919/ ---
[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2338 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6082/ ---
[GitHub] carbondata pull request #2314: [CARBONDATA-2309][DataLoad] Add strategy to g...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2314#discussion_r190455538

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -575,11 +577,12 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
  * @param noOfNodesInput -1 if number of nodes has to be decided
  *                       based on block location information
  * @param blockAssignmentStrategy strategy used to assign blocks
+ * @param loadMinSize the property load_min_size_inmb specified by the user
  * @return a map that maps node to blocks
  */
 public static Map nodeBlockMapping(
     List blockInfos, int noOfNodesInput, List activeNodes,
-    BlockAssignmentStrategy blockAssignmentStrategy) {
+    BlockAssignmentStrategy blockAssignmentStrategy, String loadMinSize) {
--- End diff --

How about changing the name `loadMinSize` to `expectedMinSizePerNode`? ---
[GitHub] carbondata pull request #2314: [CARBONDATA-2309][DataLoad] Add strategy to g...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2314#discussion_r190455207

--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
@@ -596,20 +599,50 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
   // calculate the average expected size for each node
   long sizePerNode = 0;
+  long totalFileSize = 0;
   if (BlockAssignmentStrategy.BLOCK_NUM_FIRST == blockAssignmentStrategy) {
     sizePerNode = blockInfos.size() / noofNodes;
     sizePerNode = sizePerNode <= 0 ? 1 : sizePerNode;
-  } else if (BlockAssignmentStrategy.BLOCK_SIZE_FIRST == blockAssignmentStrategy) {
-    long totalFileSize = 0;
+  } else if (BlockAssignmentStrategy.BLOCK_SIZE_FIRST == blockAssignmentStrategy
+      || BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST == blockAssignmentStrategy) {
     for (Distributable blockInfo : uniqueBlocks) {
       totalFileSize += ((TableBlockInfo) blockInfo).getBlockLength();
     }
     sizePerNode = totalFileSize / noofNodes;
   }
-  // assign blocks to each node
-  assignBlocksByDataLocality(rtnNode2Blocks, sizePerNode, uniqueBlocks, originNode2Blocks,
-      activeNodes, blockAssignmentStrategy);
+  // if enable to control the minimum amount of input data for each node
+  if (BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST == blockAssignmentStrategy) {
+    long iLoadMinSize = 0;
+    // validate the property load_min_size_inmb specified by the user
+    if (CarbonUtil.validateValidIntType(loadMinSize)) {
+      iLoadMinSize = Integer.parseInt(loadMinSize);
+    } else {
+      LOGGER.warn("Invalid load_min_size_inmb value found: " + loadMinSize
+          + ", only int value greater than 0 is supported.");
+      iLoadMinSize = CarbonCommonConstants.CARBON_LOAD_MIN_SIZE_DEFAULT;
+    }
+    // If the average expected size for each node greater than load min size,
+    // then fall back to default strategy
+    if (iLoadMinSize * 1024 * 1024 < sizePerNode) {
+      if (CarbonProperties.getInstance().isLoadSkewedDataOptimizationEnabled()) {
+        blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
+      } else {
+        blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
+      }
+    } else {
--- End diff --

Better to add a log:
```
LOG.info("Specified minimum data size to load is less than the average size for each node, fallback to default strategy" + blockAssignmentStrategy);
```
---
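The fallback being discussed in the diff above can be summarized as a standalone decision: if the user-specified minimum size per node is already below the average size each node would receive anyway, the min-size strategy adds nothing, so the code falls back to a default strategy. This is an illustrative sketch with names and units assumed from the diff, not the actual CarbonLoaderUtil code:

```java
public class StrategyFallback {
    enum BlockAssignmentStrategy { BLOCK_NUM_FIRST, BLOCK_SIZE_FIRST, NODE_MIN_SIZE_FIRST }

    // Hypothetical standalone version of the fallback branch:
    // loadMinSizeMb is the user property (MB), sizePerNodeBytes the computed average.
    static BlockAssignmentStrategy choose(long loadMinSizeMb, long sizePerNodeBytes,
                                          boolean skewOptimizationEnabled) {
        if (loadMinSizeMb * 1024 * 1024 < sizePerNodeBytes) {
            // Minimum size is below the average each node gets anyway: fall back.
            return skewOptimizationEnabled
                ? BlockAssignmentStrategy.BLOCK_SIZE_FIRST
                : BlockAssignmentStrategy.BLOCK_NUM_FIRST;
        }
        return BlockAssignmentStrategy.NODE_MIN_SIZE_FIRST;
    }

    public static void main(String[] args) {
        // 256 MB minimum, but each node already averages 1 GB: fall back.
        System.out.println(choose(256, 1L << 30, true));   // BLOCK_SIZE_FIRST
        // 2048 MB minimum, average only 1 GB: keep the min-size strategy.
        System.out.println(choose(2048, 1L << 30, false)); // NODE_MIN_SIZE_FIRST
    }
}
```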
[GitHub] carbondata issue #2314: [CARBONDATA-2309][DataLoad] Add strategy to generate...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2314 @ndwangsen please resolve the conflicts and review comments ---
[jira] [Created] (CARBONDATA-2525) Search mode can not register master in security cluster
xubo245 created CARBONDATA-2525:
-----------------------------------
Summary: Search mode can not register master in security cluster
Key: CARBONDATA-2525
URL: https://issues.apache.org/jira/browse/CARBONDATA-2525
Project: CarbonData
Issue Type: Bug
Reporter: xubo245
Assignee: xubo245
[GitHub] carbondata issue #2290: [CARBONDATA-2389] Search mode support lucene datamap
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2290 Please help to review it. @jackylk @ravipesala @QiangCai ---
[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2339 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4926/ ---
[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2339 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6087/ ---
[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2338 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5081/ ---
[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2336 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6079/ ---
[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2336 retest this please ---
[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2337 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6083/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6084/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4921/ ---
[GitHub] carbondata pull request #2332: [CARBONDATA-2514] Added condition to check fo...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2332 ---
[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2337 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4924/ ---
[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2337 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6085/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2318 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5082/ ---
[GitHub] carbondata pull request #2339: [WIP][CARBONDATA-2525] Search mode can not re...
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/2339

[WIP][CARBONDATA-2525] Search mode can not register master in security cluster

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
      Please provide details on
      - Whether new unit test cases have been added or why no new tests are required?
      - How it is tested? Please attach test report.
      - Is it a performance related change? Please attach the performance test report.
      - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CARBONDATA-2525-registerMaster

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2339.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2339

commit c9962df700a4dfa3b7e11e6293a7bd851601cda5
Author: xubo245
Date: 2018-05-24T03:59:14Z

    [CARBONDATA-2525] Search mode can not register master in security cluster

---
[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2338 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4922/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2318 retest this please ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4925/ ---
[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2338 retest this please ---
[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2337 retest this please ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6081/ ---
[GitHub] carbondata issue #2338: [CARBONDATA-2524] Support create carbonReader with d...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2338 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5083/ ---
[jira] [Resolved] (CARBONDATA-2514) Duplicate columns in CarbonWriter is throwing NullPointerException
[ https://issues.apache.org/jira/browse/CARBONDATA-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2514.
Resolution: Fixed
Assignee: Kunal Kapoor
Fix Version/s: 1.4.1

> Duplicate columns in CarbonWriter is throwing NullPointerException
> ------------------------------------------------------------------
>
> Key: CARBONDATA-2514
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2514
> Project: CarbonData
> Issue Type: Bug
> Reporter: Kunal Kapoor
> Assignee: Kunal Kapoor
> Priority: Minor
> Fix For: 1.4.1
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
[GitHub] carbondata issue #2339: [WIP][CARBONDATA-2525] Search mode can not register ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2339 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5084/ ---
[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2336 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6086/ ---
[GitHub] carbondata issue #2336: [CARBONDATA-2521] Support create carbonReader withou...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2336 retest this please ---
[GitHub] carbondata pull request #2327: [WIP] Bloom remove guava cache and use Carbon...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2327#discussion_r190448358

--- Diff: core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java ---
@@ -101,6 +102,31 @@ public static CacheProvider getInstance() {
     return cacheTypeToCacheMap.get(cacheType);
   }
+
+  /**
+   * This method will check if a cache already exists for the given cache type and store
+   * it if it is not present in the map
+   */
+  public Cache createCache(CacheType cacheType, String cacheClassName)
--- End diff --

Then it will be a disadvantage compared to guava-cache. Besides, I think it's quite useful to configure the size of the cache at a fine granularity. ---
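As background for the cache-size discussion above: a minimal size-bounded LRU cache can be built on `LinkedHashMap`, which illustrates the kind of per-cache capacity control being asked for. This sketch is illustrative only and does not reproduce CarbonData's `CacheProvider`/`Cache` interfaces or Guava's `CacheBuilder`:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache whose capacity is configured per instance,
// i.e. at "fine granularity" rather than one global setting.
public class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once the bound is exceeded.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLruCache<String, Integer> cache = new BoundedLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touch "a" so "b" becomes the eldest entry
        cache.put("c", 3);  // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```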
[jira] [Resolved] (CARBONDATA-2221) Drop table should throw exception when metastore operation failed
[ https://issues.apache.org/jira/browse/CARBONDATA-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-2221.
Resolution: Fixed

> Drop table should throw exception when metastore operation failed
> ------------------------------------------------------------------
>
> Key: CARBONDATA-2221
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2221
> Project: CarbonData
> Issue Type: Bug
> Reporter: Jacky Li
> Priority: Major
> Fix For: 1.4.0
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
[jira] [Updated] (CARBONDATA-2188) Support DataMap developer interface
[ https://issues.apache.org/jira/browse/CARBONDATA-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2188:
Fix Version/s: (was: 1.4.0)

> Support DataMap developer interface
> -----------------------------------
>
> Key: CARBONDATA-2188
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2188
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jacky Li
> Priority: Major
>
> Developer should be able to add a new DataMap implementation; a developer interface should be added
[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2331 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5068/ ---
[jira] [Created] (CARBONDATA-2521) Support create carbonReader without tableName
xubo245 created CARBONDATA-2521:
-----------------------------------
Summary: Support create carbonReader without tableName
Key: CARBONDATA-2521
URL: https://issues.apache.org/jira/browse/CARBONDATA-2521
Project: CarbonData
Issue Type: Improvement
Reporter: xubo245
Assignee: xubo245

Support create carbonReader without tableName
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4909/ ---
[GitHub] carbondata issue #2321: [CARBONDATA-2520] Clean and close datamap writers on...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2321 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6070/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2318 @jackylk @ravipesala Hello, sounakr gave an LGTM and CI passed. Could you please check and merge it if there is no problem? ---
[GitHub] carbondata issue #2308: [WIP]Adding SDV Testcases for SDKwriter
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2308 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6067/ ---
[jira] [Updated] (CARBONDATA-1332) Dictionary generation time in spark 2.1 is more than spark 1.5
[ https://issues.apache.org/jira/browse/CARBONDATA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1332:
Fix Version/s: (was: 1.4.0)

> Dictionary generation time in spark 2.1 is more than spark 1.5
> ---------------------------------------------------------------
>
> Key: CARBONDATA-1332
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1332
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration
> Affects Versions: 1.1.1
> Reporter: Venkata Ramana G
> Priority: Major
[jira] [Updated] (CARBONDATA-1141) Data load is partially successful but delete error
[ https://issues.apache.org/jira/browse/CARBONDATA-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1141:
Fix Version/s: (was: 1.4.0)

> Data load is partially successful but delete error
> ---------------------------------------------------
>
> Key: CARBONDATA-1141
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1141
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration, sql
> Affects Versions: 1.2.0
> Environment: spark on yarn, carbondata 1.2.0, hadoop 2.7, spark 2.1.0, hive 2.1.0
> Reporter: zhuzhibin
> Priority: Major
> Attachments: error.png, error1.png
>
> When I tried to load data into a table (data size is about 300 million rows), the log showed me that "Data load is partially successful for table",
> but when I executed a delete table operation, some errors appeared. The error message is "java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.carbondata.core.mutate.CarbonUpdateUtil.getRequiredFieldFromTID(CarbonUpdateUtil.java:67)".
> When I executed another delete table operation with a where condition, it was successful, but a subsequent select operation produced
> "java.lang.ArrayIndexOutOfBoundsException Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)"
[jira] [Resolved] (CARBONDATA-2276) Support SDK API to read schema in data file and schema file
[ https://issues.apache.org/jira/browse/CARBONDATA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-2276.
Resolution: Fixed

> Support SDK API to read schema in data file and schema file
> ------------------------------------------------------------
>
> Key: CARBONDATA-2276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2276
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jacky Li
> Assignee: Jacky Li
> Priority: Major
> Fix For: 1.4.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2318 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6069/ ---
[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2331 retest this please ---
[GitHub] carbondata issue #2335: [WIP] integrate carbonstore mv branch
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2335 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4910/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2318 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5066/ ---
[jira] [Created] (CARBONDATA-2520) datamap writers are not getting closed on task failure
Akash R Nilugal created CARBONDATA-2520: --- Summary: datamap writers are not getting closed on task failure Key: CARBONDATA-2520 URL: https://issues.apache.org/jira/browse/CARBONDATA-2520 Project: CarbonData Issue Type: Bug Components: data-load Reporter: Akash R Nilugal Assignee: Akash R Nilugal *Problem:* The datamap writers registered to the listener are closed or finished only in the load success case, not in any failure case. While testing Lucene it was found that after a task fails the writer is not closed, so the write.lock file written in the Lucene index folder still exists; when the next task tries to write an index in the same directory, it fails with a "lock file already exists" error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
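The fix direction described above can be sketched as a try/finally around the task body, so writers are closed on both success and failure. This is a simplified illustration with hypothetical names, not the actual CarbonData listener API:

```java
import java.util.List;

// Simplified sketch: close datamap writers on any task outcome, so that
// resources such as Lucene's write.lock are always released.
class WriterLifecycle {
    interface DataMapWriter {
        void finish();  // flush index output on success
        void close();   // release resources, e.g. the Lucene write.lock
    }

    static void runTask(List<DataMapWriter> writers, Runnable taskBody) {
        boolean success = false;
        try {
            taskBody.run();
            success = true;
        } finally {
            for (DataMapWriter w : writers) {
                if (success) {
                    w.finish();
                }
                w.close();  // runs on the failure path too
            }
        }
    }
}
```

Because close() sits in the finally block, the next task writing to the same index directory no longer finds a stale lock file.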
[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2332 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6068/ ---
[jira] [Updated] (CARBONDATA-2153) Failed to update table status for pre-aggregate table when maintain insert twice and auto merge open
[ https://issues.apache.org/jira/browse/CARBONDATA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2153: Fix Version/s: (was: 1.4.0) > Failed to update table status for pre-aggregate table when maintain insert > twice and auto merge open > - > > Key: CARBONDATA-2153 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2153 > Project: CarbonData > Issue Type: Improvement > Components: core, spark-integration >Affects Versions: 1.3.0 >Reporter: xubo245 >Assignee: xubo245 >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Failed to update table status for pre-aggregate table when maintain insert > twice -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2516) Filter Greater-than for timestamp datatype not generating Expression in PrestoFilterUtil
[ https://issues.apache.org/jira/browse/CARBONDATA-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2516: Fix Version/s: (was: 1.4.0) > Filter Greater-than for timestamp datatype not generating Expression in > PrestoFilterUtil > > > Key: CARBONDATA-2516 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2516 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Affects Versions: 1.4.0 > Reporter: Sourabh Verma > Priority: Major > > Scenario - carbondata table 'load_table' with columns 'integer', 'datetime' > // table creation and load code (spark) > val random = new Random() > val df = spark.sparkContext.parallelize(1 to (365 * 24 * 360)) > .map(x => (random.nextInt(200), new Timestamp(currentMillis - (x * 1000L)))) > .toDF("integer", "datetime") > // Save dataframe to carbondata file > df.write.format("carbondata") > .option("tableName", "load_table") > .option("compress", "true") > .option("tempCSV", "false") > .mode(SaveMode.Overwrite) > .save() > SQL (through Presto CLI) - select * from load_table where datetime > date_parse('2018-05-10 18:22:15', '%Y-%m-%d %T'); > Issue - CarbonData does a full scan over the files even though a greater-than filter was passed on the timestamp column. > Cause - PrestoFilterUtil does not create a greater-than Expression for timestamp. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2072) Add dropTables method for optimizing drop table operation in test cases
[ https://issues.apache.org/jira/browse/CARBONDATA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2072: Fix Version/s: (was: 1.4.0) > Add dropTables method for optimizing drop table operation in test cases > --- > > Key: CARBONDATA-2072 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2072 > Project: CarbonData > Issue Type: Test > Components: test > Affects Versions: 1.3.0 > Reporter: xubo245 > Assignee: xubo245 > Priority: Minor > Time Spent: 8h > Remaining Estimate: 0h > > There are many drop table statements in beforeAll or afterAll of test cases, like this: > {code:java} > override def afterAll { > sql("drop table if exists load") > sql("drop table if exists inser") > sql("DROP TABLE IF EXISTS THive") > sql("DROP TABLE IF EXISTS TCarbon") > sql("drop table if exists TCarbonLocal") > sql("drop table if exists TCarbonSource") > sql("drop table if exists loadtable") > sql("drop table if exists insertTable") > sql("drop table if exists CarbonDest") > sql("drop table if exists HiveDest") > sql("drop table if exists CarbonOverwrite") > sql("drop table if exists HiveOverwrite") > sql("drop table if exists tcarbonsourceoverwrite") > sql("drop table if exists carbon_table1") > sql("drop table if exists carbon_table") > sql("DROP TABLE IF EXISTS student") > sql("DROP TABLE IF EXISTS uniqdata") > sql("DROP TABLE IF EXISTS show_insert") > sql("drop table if exists OverwriteTable_t1") > sql("drop table if exists OverwriteTable_t2") > } > {code} > in org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase. > This can be replaced by a public method in QueryTest. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
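The suggested QueryTest helper could look roughly like the sketch below. The method name comes from the issue title; the `Consumer` is a stand-in for the test framework's `sql(...)` executor, so this is an illustration rather than the committed implementation:

```java
import java.util.function.Consumer;

// Hypothetical dropTables helper as proposed for QueryTest; `sql` stands in
// for the framework's SQL executor.
class DropTablesHelper {
    static void dropTables(Consumer<String> sql, String... tableNames) {
        for (String name : tableNames) {
            sql.accept("DROP TABLE IF EXISTS " + name);
        }
    }
}
```

With a helper like this, the long afterAll body above collapses to a single call listing the table names.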
[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2331 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6065/ ---
[GitHub] carbondata issue #2318: [CARBONDATA-2491] Fix the error when reader read twi...
Github user sounakr commented on the issue: https://github.com/apache/carbondata/pull/2318 LGTM ---
[jira] [Updated] (CARBONDATA-1014) Refactor on data loading and encoding override
[ https://issues.apache.org/jira/browse/CARBONDATA-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1014: Fix Version/s: (was: 1.4.0) > Refactor on data loading and encoding override > -- > > Key: CARBONDATA-1014 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1014 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Priority: Major > > Refactor on current data loading flow to make it: > 1. Use vectorized processing as early as possible > 2. Make index build (sorting) CPU cache efficient, by using rowId and key > column vector to sort > 3. Open interface for format extension, including column encoding, > compression, statistics. > Design doc will be posted in this JIRA soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2154) Carbon should support match pre-aggregate table when SET carbon.input.segments.default.carbon_table=*
[ https://issues.apache.org/jira/browse/CARBONDATA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2154: Fix Version/s: (was: 1.4.0) > Carbon should support match pre-aggregate table when SET > carbon.input.segments.default.carbon_table=* > - > > Key: CARBONDATA-2154 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2154 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Affects Versions: 1.3.0 >Reporter: xubo245 >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Carbon should support match pre-aggregate table when SET > carbon.input.segments.default.carbon_table=* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming
[ https://issues.apache.org/jira/browse/CARBONDATA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-2337. - Resolution: Fixed > Fix duplicately acquiring 'streaming.lock' error when integrating with > spark-streaming > -- > > Key: CARBONDATA-2337 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2337 > Project: CarbonData > Issue Type: Bug > Components: spark-integration > Affects Versions: 1.4.0 > Reporter: Zhichao Zhang > Assignee: Zhichao Zhang > Priority: Minor > Fix For: 1.4.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > After [PR2135|https://github.com/apache/carbondata/pull/2135] was merged, 'streaming.lock' is acquired twice when integrating with spark-streaming. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2233) Improve test cases of DBLOcationCarbonTableTestCase
[ https://issues.apache.org/jira/browse/CARBONDATA-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2233: Fix Version/s: (was: 1.4.0) > Improve test cases of DBLOcationCarbonTableTestCase > - > > Key: CARBONDATA-2233 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2233 > Project: CarbonData > Issue Type: Improvement > Components: test >Affects Versions: 1.4.0 >Reporter: anubhav tarar >Assignee: anubhav tarar >Priority: Trivial > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2169) Conflicting classes cause NoSuchMethodError, when our project using org.apache.carbondata:carbondata-hive:1.3.0
[ https://issues.apache.org/jira/browse/CARBONDATA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2169: Fix Version/s: (was: 1.4.0) > Conflicting classes cause NoSuchMethodError, when our project using > org.apache.carbondata:carbondata-hive:1.3.0 > --- > > Key: CARBONDATA-2169 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2169 > Project: CarbonData > Issue Type: Bug > Components: hive-integration > Affects Versions: 1.3.0 > Reporter: PandaMonkey > Priority: Minor > Attachments: carbondata conflicts.txt > > > Hi, when we used org.apache.carbondata:carbondata-hive:1.3.0, we got a *NoSuchMethodError*. By analyzing the source code, we found the root cause is conflicting classes in different JARs: duplicate classes exist in different JARs but with different behavior, so the classes actually loaded are not the ones our project requires. (The JVM loads the class that appears first on the classpath and shadows the other duplicates with the same name.) Such conflicts exist in several JAR pairs depended on by carbondata-hive:1.3.0. The detailed conflict info is listed in the attachment. > Conflicting JAR pairs: (listed in the attached carbondata conflicts.txt) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2000) Unable to save a dataframe result as carbondata streaming table
[ https://issues.apache.org/jira/browse/CARBONDATA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2000: Fix Version/s: (was: 1.4.0) > Unable to save a dataframe result as carbondata streaming table > --- > > Key: CARBONDATA-2000 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2000 > Project: CarbonData > Issue Type: Bug > Components: spark-integration > Affects Versions: 1.3.0 > Environment: spark-2.1 > Reporter: anubhav tarar > Assignee: anubhav tarar > Priority: Trivial > > 1. create carbonsession > import org.apache.spark.sql.SparkSession > import org.apache.spark.sql.CarbonSession._ > val carbon = SparkSession.builder().config(sc.getConf) > .getOrCreateCarbonSession("hdfs://localhost:54311/newCarbonStore", "/tmp") > 2. create a dataframe with carbonsession > import carbon.sqlContext.implicits._ > carbon.sql("drop table if exists streamingtable") > val df = carbon.sparkContext.parallelize(1 to 5).toDF("colId") > 3. register dataframe as carbon streaming table > df.write.format("carbondata").option("tableName","streamingTable").option("streaming","true").mode(SaveMode.Overwrite).save() > 4. desc formatted the table > carbon.sql("describe formatted streamingTable").show(100) > ++++ > |col_name| data_type| comment| > ++++ > |colid...|int ...|MEASURE,null ...| > | ...| ...| ...| > |##Detailed Table ...| ...| ...| > |Database Name...|default ...| ...| > |Table Name ...|streamingtable ...| ...| > |CARBON Store Path...|hdfs://localhost:...| ...| > |Comment ...| ...| ...| > |Table Block Size ...|1024 MB ...| ...| > |Table Data Size ...|316 ...| ...| > |Table Index Size ...|283 ...| ...| > |Last Update Time ...|1515393447642...| ...| > |SORT_SCOPE ...|LOCAL_SORT ...|LOCAL_SORT ...| > |Streaming...|false...| ...| > |SORT_SCOPE ...|LOCAL_SORT ...|LOCAL_SORT ...| > | ...| ...| ...| > |##Detailed Column...| ...| ...| > |ADAPTIVE ...| ...| ...| > |SORT_COLUMNS ...| ...| ...| > ++++ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2332 Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4908/ ---
[GitHub] carbondata issue #2335: [WIP] integrate carbonstore mv branch
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2335 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6071/ ---
[GitHub] carbondata pull request #2336: [CARBONDATA-2521] Support create carbonReader...
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/2336 [CARBONDATA-2521] Support create carbonReader without tableName Add new method for creating carbonReader without tableName Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? NA - [ ] Document update required? No - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. Add some test case - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. No You can merge this pull request into a Git repository by running: $ git pull https://github.com/xubo245/carbondata CARBONDATA-2521-supportReaderWithoutTableName Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2336.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2336 commit 80b4c9a724513d79439cbfcba62605648a4b5450 Author: xubo245 Date: 2018-05-23T13:08:23Z [CARBONDATA-2521] Support create carbonReader without tableName ---
[GitHub] carbondata issue #2332: [CARBONDATA-2514] Added condition to check for dupli...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/2332 retest this please ---
[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2331 Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4905/ ---
[GitHub] carbondata issue #2314: [CARBONDATA-2309][DataLoad] Add strategy to generate...
Github user ndwangsen commented on the issue: https://github.com/apache/carbondata/pull/2314 @kumarvishal09 If the user-specified (or default) minimum data load per node is less than the average data amount per node, the existing strategy is used to handle it ---
[jira] [Updated] (CARBONDATA-2099) Refactor on query scan process to improve readability
[ https://issues.apache.org/jira/browse/CARBONDATA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2099: Fix Version/s: (was: 1.4.0) > Refactor on query scan process to improve readability > - > > Key: CARBONDATA-2099 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2099 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Priority: Major > Time Spent: 5h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-1600) Carbon store abstraction
[ https://issues.apache.org/jira/browse/CARBONDATA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1600: Fix Version/s: (was: 1.4.0) NONE > Carbon store abstraction > > > Key: CARBONDATA-1600 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1600 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Assignee: Jacky Li >Priority: Major > Fix For: NONE > > > There should be a carbondata-store module to abstract all functionalities > that above file format, and provide developer API to support different > compute engine to integrate with carbon -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-1871) Add annotation for interface compatibility
[ https://issues.apache.org/jira/browse/CARBONDATA-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1871: Fix Version/s: (was: 1.4.0) > Add annotation for interface compatibility > -- > > Key: CARBONDATA-1871 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1871 > Project: CarbonData > Issue Type: Improvement > Reporter: Jacky Li > Priority: Major > > All user-facing APIs should be annotated with a proper stability level. > InterfaceStability levels include: > 1. Forever: API at this level is compatible across major versions > 2. Stable: API at this level is compatible across minor versions, may break across major versions > 3. Evolving: API at this level is compatible across maintenance versions, may break across minor versions > 4. Unstable: API at this level has no backward-compatibility guarantee > Since users mainly use SQL with CarbonData, the APIs that need to be annotated include: > 1. Table properties in create table > 2. Load options in load data and the dataframe API > 3. Carbon properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
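The four levels described in the issue could be expressed as plain Java annotations along these lines. This is a sketch only; the annotation names, nesting, and retention policy the project actually chose may differ:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stability annotations mirroring the four levels in the issue.
class InterfaceStability {
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Forever {}   // compatible across major versions

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Stable {}    // compatible across minor versions

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Evolving {}  // compatible across maintenance versions

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Unstable {}  // no backward-compatibility guarantee
}

// Example: a (hypothetical) table-property handler marked as Evolving.
@InterfaceStability.Evolving
class TablePropertyApi {}
```

Tooling or reviewers can then check the declared level of any public class reflectively before accepting a breaking change.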
[jira] [Resolved] (CARBONDATA-2159) Remove carbon-spark dependency for sdk module
[ https://issues.apache.org/jira/browse/CARBONDATA-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-2159. - Resolution: Fixed > Remove carbon-spark dependency for sdk module > - > > Key: CARBONDATA-2159 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2159 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Priority: Major > Fix For: 1.4.0 > > Time Spent: 5h > Remaining Estimate: 0h > > store-sdk module should not depend on carbon-spark module -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2162) Remove spark dependency in carbon-core and carbon-processing module
[ https://issues.apache.org/jira/browse/CARBONDATA-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2162: Fix Version/s: (was: 1.4.0) > Remove spark dependency in carbon-core and carbon-processing module > --- > > Key: CARBONDATA-2162 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2162 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Priority: Major > > The assembly JAR of store-sdk module should be small, but currently it > includes spark JAR because carbon-core, carbon-processing, carbon-hadoop > modules depends on spark. > This dependency should be removed -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-1825) Carbon 1.3.0 - Spark 2.2- Data load fails on 20k columns carbon table with CarbonDataWriterException
[ https://issues.apache.org/jira/browse/CARBONDATA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1825: Fix Version/s: (was: 1.4.0) > Carbon 1.3.0 - Spark 2.2- Data load fails on 20k columns carbon table with > CarbonDataWriterException > > > Key: CARBONDATA-1825 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1825 > Project: CarbonData > Issue Type: Bug > Components: data-load > Affects Versions: 1.3.0 > Environment: Test - 3 node ant cluster > Reporter: Ramakrishna S > Assignee: kumar vishal > Priority: Minor > Labels: DFX > > Steps: > Beeline: > 1. Create a carbon table with 20k columns > 2. Run table load > *+Expected:+* Table load should succeed > *+Actual:+* table load fails -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2515) Filter OR Expression not working properly in Presto integration
[ https://issues.apache.org/jira/browse/CARBONDATA-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-2515: Fix Version/s: (was: 1.4.0) 1.4.1 > Filter OR Expression not working properly in Presto integration > --- > > Key: CARBONDATA-2515 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2515 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Affects Versions: 1.4.0 > Environment: Spark 2.1, Presto 0.187 > Reporter: Sourabh Verma > Priority: Major > Fix For: 1.4.1 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Scenario - carbondata table 'load_table' with columns 'integer', 'datetime' > // table creation and load code (spark) > val random = new Random() > val df = spark.sparkContext.parallelize(1 to (365 * 24 * 360)) > .map(x => (random.nextInt(200), new Timestamp(currentMillis - (x * 1000L)))) > .toDF("integer", "datetime") > // Save dataframe to carbondata file > df.write.format("carbondata") > .option("tableName", "load_table") > .option("compress", "true") > .option("tempCSV", "false") > .mode(SaveMode.Overwrite) > .save() > SQL (through Presto CLI) - select * from load_table where integer < 10 or > integer > 50; > Actual result - 0 rows. > Expected result - rows with integer value less than 10 or greater than 50. > Cause - PrestoFilterUtil creates AND Expressions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-1603) support user specified segments in major compaction
[ https://issues.apache.org/jira/browse/CARBONDATA-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-1603: Fix Version/s: (was: 1.4.0) > support user specified segments in major compaction > --- > > Key: CARBONDATA-1603 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1603 > Project: CarbonData > Issue Type: Improvement > Components: data-load >Affects Versions: 1.3.0 >Reporter: Zhoujin >Priority: Minor > Original Estimate: 72h > Remaining Estimate: 72h > > support user specified segments in major compaction > the proposed syntax: > ALTER TABLE [db_name].table_name COMPACT [SEGMENT seg_id1,seg_id2] 'MAJOR' > in which [SEGMENT seg_id1,seg_id2] is optional and compatible with original > syntax. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2337: [CARBONDATA-2519] Add document for CarbonReader
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2337 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6078/ ---
[GitHub] carbondata issue #2331: [CARBONDATA-2507] enable.offheap.sort not validate i...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2331 @sraghunandan @rahulforallp This PR fix possible problem in #2274, Please review it. ---