[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/523 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
GitHub user ravikiran23 reopened a pull request:

    https://github.com/apache/incubator-carbondata/pull/523

    [CARBONDATA-440] fixing no kettle issue for IUD.

    IUD uses the data load flow, so the data load must also be handled in the NO-KETTLE case.
    The load count/segment count should be a string, because in the compaction case it will be 2.1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravikiran23/incubator-carbondata IUD-NO-KETTLE

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/523.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #523

commit 3cd90280c9b7f1ccb7d7e42dbf2ecab37419799f
Author: ravikiran
Date:   2017-01-09T13:28:13Z

    fixing no kettle issue for IUD. load count/ segment count should be string because in compaction case it will be 2.1
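The remark in the description that the load/segment count should be a string can be illustrated with a small sketch. It assumes only what the description states, namely that a compacted segment can carry an identifier such as 2.1. The object `SegmentIdSketch` and the helper `asInt` are hypothetical names used for illustration and are not part of CarbonData.

```scala
import scala.util.Try

object SegmentIdSketch {
  // Hypothetical helper: tries to read a segment id as an Int.
  def asInt(segmentId: String): Option[Int] = Try(segmentId.toInt).toOption

  def main(args: Array[String]): Unit = {
    val plainSegment = "2"       // id of an ordinary load
    val compactedSegment = "2.1" // id after compaction, per the PR description

    println(asInt(plainSegment))     // Some(2)
    println(asInt(compactedSegment)) // None, because "2.1" is not a valid Int
  }
}
```

Keeping the identifier as a String avoids this parsing failure, which is presumably why the PR changes the type.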
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 closed the pull request at: https://github.com/apache/incubator-carbondata/pull/523
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169571

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    +        null,
    +        segId,
    +        loadMetadataDetails)
    +      // Intialize to set carbon properties
    +      loader.initialize()
    +
    +      loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +      new DataLoadExecutor()
    +        .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169581

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169576

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    +        null,
    +        segId,
    +        loadMetadataDetails)
    +      // Intialize to set carbon properties
    +      loader.initialize()
    +
    +      loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +      new DataLoadExecutor()
    +        .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    +
    +    } catch {
    +      case e: BadRecordFoundException =>
    +        loadMetadataDetails
    +          .setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS)
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169584

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169559

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920952

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    --- End diff --

    You are following a different code style here; can you make the style match the rest of the code?
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920765

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    +        null,
    +        segId,
    +        loadMetadataDetails)
    +      // Intialize to set carbon properties
    +      loader.initialize()
    +
    +      loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +      new DataLoadExecutor()
    +        .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    --- End diff --

    Move this to the previous line, and break the line at the parameter list.
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920779

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  } else {
    +    try {
    +      val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +      val serializer = SparkEnv.get.closureSerializer.newInstance()
    +      var serializeBuffer: ByteBuffer = null
    +      recordReaders += new CarbonIteratorImpl(
    +        new NewRddIterator(iter,
    +          carbonLoadModel,
    +          TaskContext.get()))
    +
    +      val loader = new SparkPartitionLoader(carbonLoadModel,
    +        index,
    +        null,
    +        null,
    +        segId,
    +        loadMetadataDetails)
    +      // Intialize to set carbon properties
    +      loader.initialize()
    +
    +      loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +      new DataLoadExecutor()
    +        .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    +
    +    } catch {
    +      case e: BadRecordFoundException =>
    +        loadMetadataDetails
    +          .setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS)
    --- End diff --

    Move this to the previous line, and break the line at the parameter list.
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920624

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    --- End diff --

    Move the `try` so that it wraps only `CarbonDataLoadForUpdate.run`; we should limit the scope of the `try`. Do the same for the next `try` as well.
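The reviewer's request to limit the `try` scope can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `registry` is a hypothetical stand-in for a key/iterator registry such as `RddInpututilsForUpdate`, and `runLoad` is a hypothetical stand-in for the load call.

```scala
import scala.collection.mutable

object TryScopeSketch {
  // Hypothetical stand-in for a registry such as RddInpututilsForUpdate.
  val registry: mutable.Map[String, Iterator[String]] = mutable.Map.empty

  // Hypothetical stand-in for the load call; fails on purpose when asked to.
  def runLoad(key: String, fail: Boolean): Int = {
    if (fail) throw new IllegalStateException("load failed")
    registry(key).size
  }

  def load(key: String, rows: Iterator[String], fail: Boolean): Int = {
    // Setup stays outside the try block.
    registry.put(key, rows)
    try {
      // Only the call that needs cleanup is guarded.
      runLoad(key, fail)
    } finally {
      // Cleanup runs whether runLoad returns or throws.
      registry.remove(key)
    }
  }
}
```

One trade-off of the narrower scope: any statement placed between the registration and the `try` that throws would skip the cleanup, so the guarded call should follow the registration directly.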
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95752235

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  }
    +  else {
    --- End diff --

    fixed
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95751767

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    --- End diff --

    As of now, IUD is supported in 1.6.2; support is not there for 2.1.
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95704439

    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
       loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
       val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString
    +  if (useKettle) {
    +    try {
    +      RddInpututilsForUpdate.put(rddIteratorKey,
    +        new RddIteratorForUpdate(iter, carbonLoadModel))
    +      carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +      CarbonDataLoadForUpdate
    +        .run(carbonLoadModel, index, storePath, kettleHomePath,
    +          segId, loadMetadataDetails, executionErrors)
    +    } finally {
    +      RddInpututilsForUpdate.remove(rddIteratorKey)
    +    }
    +  }
    +  else {
    --- End diff --

    Move this to the previous line.