[GitHub] carbondata pull request #1019: [CARBONDATA-1156]Improve IUD performance and ...

2017-06-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1019




[GitHub] carbondata pull request #1019: [CARBONDATA-1156]Improve IUD performance and ...

2017-06-12 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1019#discussion_r121399194
  
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java ---
@@ -126,6 +144,82 @@ private void intialiseInfos() {
     }
   }
 
+  /**
+   * Below method will be used to get the delete delta rows for a block
+   *
+   * @param dataBlock       data block
+   * @param deleteDeltaInfo delete delta info
+   * @return blockid+pageid to deleted row mapping
+   */
+  private Map<String, DeleteDeltaVo> getDeleteDeltaDetails(AbstractIndex dataBlock,
+      DeleteDeltaInfo deleteDeltaInfo) {
+    // if datablock deleted delta timestamp is more than the current delete delta files
+    // timestamp then return the current deleted rows
+    if (dataBlock.getDeleteDeltaTimestamp() >= deleteDeltaInfo
+        .getLatestDeleteDeltaFileTimestamp()) {
+      return dataBlock.getDeletedRowsMap();
+    }
+    CarbonDeleteFilesDataReader carbonDeleteDeltaFileReader = null;
+    // get the lock object so in case of concurrent query only one task will read the
+    // delete delta files, other tasks will wait
+    Object lockObject = deleteDeltaToLockObjectMap.get(deleteDeltaInfo);
+    // if lock object is null then add a lock object
+    if (null == lockObject) {
+      synchronized (deleteDeltaToLockObjectMap) {
+        // double checking
--- End diff --

ok. I missed it:)




[GitHub] carbondata pull request #1019: [CARBONDATA-1156]Improve IUD performance and ...

2017-06-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1019#discussion_r121395234
  
--- Diff: core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java ---
@@ -126,6 +144,82 @@ private void intialiseInfos() {
     }
   }
 
+  /**
+   * Below method will be used to get the delete delta rows for a block
+   *
+   * @param dataBlock       data block
+   * @param deleteDeltaInfo delete delta info
+   * @return blockid+pageid to deleted row mapping
+   */
+  private Map<String, DeleteDeltaVo> getDeleteDeltaDetails(AbstractIndex dataBlock,
+      DeleteDeltaInfo deleteDeltaInfo) {
+    // if datablock deleted delta timestamp is more than the current delete delta files
+    // timestamp then return the current deleted rows
+    if (dataBlock.getDeleteDeltaTimestamp() >= deleteDeltaInfo
+        .getLatestDeleteDeltaFileTimestamp()) {
+      return dataBlock.getDeletedRowsMap();
+    }
+    CarbonDeleteFilesDataReader carbonDeleteDeltaFileReader = null;
+    // get the lock object so in case of concurrent query only one task will read the
+    // delete delta files, other tasks will wait
+    Object lockObject = deleteDeltaToLockObjectMap.get(deleteDeltaInfo);
+    // if lock object is null then add a lock object
+    if (null == lockObject) {
+      synchronized (deleteDeltaToLockObjectMap) {
+        // double checking
--- End diff --

Again do `deleteDeltaToLockObjectMap.get(deleteDeltaInfo);` inside the synchronized block to avoid a null pointer exception.
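
For context, a minimal sketch of the double-checked locking pattern this comment asks for: the lock object is looked up again inside the synchronized block so that an insertion by a concurrent task is seen and null is never returned. The class and field names below only mirror the diff; this is not the PR's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DeleteDeltaLockRegistry {

  // shared across tasks of a query; ConcurrentHashMap keeps the unsynchronized first read safe
  private final Map<Object, Object> deleteDeltaToLockObjectMap = new ConcurrentHashMap<>();

  Object getOrCreateLock(Object deleteDeltaInfo) {
    Object lockObject = deleteDeltaToLockObjectMap.get(deleteDeltaInfo);
    if (null == lockObject) {
      synchronized (deleteDeltaToLockObjectMap) {
        // double checking: another task may have added the lock object between the
        // first get() and acquiring the monitor, so read again instead of returning
        // the possibly-null value from the first lookup
        lockObject = deleteDeltaToLockObjectMap.get(deleteDeltaInfo);
        if (null == lockObject) {
          lockObject = new Object();
          deleteDeltaToLockObjectMap.put(deleteDeltaInfo, lockObject);
        }
      }
    }
    return lockObject;
  }
}
```

On Java 8, `deleteDeltaToLockObjectMap.computeIfAbsent(deleteDeltaInfo, k -> new Object())` would express the same intent without an explicit synchronized block.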




[GitHub] carbondata pull request #1019: [CARBONDATA-1156]Improve IUD performance and ...

2017-06-12 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1019#discussion_r121390830
  
--- Diff: core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java ---
@@ -120,7 +122,53 @@ private void initThreadPoolSize() {
       }
     }
     return pageIdDeleteRowsMap;
+  }
 
+  /**
+   * Below method will be used to read the delete delta files
+   * and get the map of blockletid and page id mapping to deleted
+   * rows
+   *
+   * @param deltaFiles delete delta files array
+   * @return map of blockletid_pageid to deleted rows
+   */
+  public Map<String, DeleteDeltaVo> getDeletedRowsDataVo(String[] deltaFiles) {
+    List<Future<DeleteDeltaBlockDetails>> taskSubmitList = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(thread_pool_size);
+    for (final String deltaFile : deltaFiles) {
+      taskSubmitList.add(executorService.submit(new Callable<DeleteDeltaBlockDetails>() {
+        @Override public DeleteDeltaBlockDetails call() throws IOException {
+          CarbonDeleteDeltaFileReaderImpl deltaFileReader =
+              new CarbonDeleteDeltaFileReaderImpl(deltaFile, FileFactory.getFileType(deltaFile));
+          return deltaFileReader.readJson();
+        }
+      }));
+    }
+    try {
+      executorService.shutdown();
+      executorService.awaitTermination(30, TimeUnit.MINUTES);
+    } catch (InterruptedException e) {
+      LOGGER.error("Error while reading the delete delta files : " + e.getMessage());
+    }
+    Map<String, DeleteDeltaVo> pageIdToBlockLetVo = new HashMap<>();
+    List<DeleteDeltaBlockletDetails> blockletDetails = null;
+    for (int i = 0; i < taskSubmitList.size(); i++) {
+      try {
+        blockletDetails = taskSubmitList.get(i).get().getBlockletDetails();
+      } catch (InterruptedException | ExecutionException e) {
+        throw new RuntimeException(e);
+      }
+      for (DeleteDeltaBlockletDetails blockletDetail : blockletDetails) {
+        DeleteDeltaVo deleteDeltaVo = pageIdToBlockLetVo.get(blockletDetail.getBlockletKey());
+        if (null == deleteDeltaVo) {
+          deleteDeltaVo = new DeleteDeltaVo();
+          pageIdToBlockLetVo.put(blockletDetail.getBlockletKey(), deleteDeltaVo);
+        }
+        deleteDeltaVo.insertData(blockletDetail.getDeletedRows());
+        ;
--- End diff --

remove semicolon
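
For reference, a self-contained sketch of the read-in-parallel-then-merge pattern the new `getDeletedRowsDataVo` method follows: one task per delete delta file, then the deleted rows are unioned under their blocklet/page key. The types below (`DeletedRows`, the stubbed parser) are stand-ins for illustration, not CarbonData classes.

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelDeltaReaderSketch {

  // stand-in for the per-blocklet deleted-row container (DeleteDeltaVo in the PR)
  static class DeletedRows {
    final Set<Integer> rows = new HashSet<>();
    void insertData(Collection<Integer> deleted) { rows.addAll(deleted); }
  }

  // stand-in parser; the PR reads JSON through CarbonDeleteDeltaFileReaderImpl
  static Map<String, List<Integer>> parseDeltaFile(String deltaFile) {
    return Collections.singletonMap("blocklet0_page0", Arrays.asList(1, 5, 9));
  }

  public static Map<String, DeletedRows> readDeletedRows(String[] deltaFiles)
      throws InterruptedException, ExecutionException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, Math.min(deltaFiles.length, 4)));
    List<Future<Map<String, List<Integer>>>> futures = new ArrayList<>();
    for (final String deltaFile : deltaFiles) {
      // one task per delete delta file
      Callable<Map<String, List<Integer>>> task = () -> parseDeltaFile(deltaFile);
      futures.add(pool.submit(task));
    }
    pool.shutdown();
    pool.awaitTermination(30, TimeUnit.MINUTES);

    // merge: rows deleted for the same blocklet/page key across files are unioned
    Map<String, DeletedRows> merged = new HashMap<>();
    for (Future<Map<String, List<Integer>>> future : futures) {
      for (Map.Entry<String, List<Integer>> entry : future.get().entrySet()) {
        merged.computeIfAbsent(entry.getKey(), k -> new DeletedRows())
            .insertData(entry.getValue());
      }
    }
    return merged;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(readDeletedRows(new String[] {"part-0.deletedelta"}).keySet());
  }
}
```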




[GitHub] carbondata pull request #1019: [CARBONDATA-1156]Improve IUD performance and ...

2017-06-12 Thread kumarvishal09
GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/carbondata/pull/1019

[CARBONDATA-1156]Improve IUD performance and fixed synchronization issue

Delete delta file loading was taking more time because the delta files were read at blocklet level; code has now been added to read them at block level.
In the current IUD design, delete delta files are listed for each block at executor level, so a query running in parallel with an IUD operation may give a wrong result. The delete delta information is now passed from the driver to the executors.
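
A hedged sketch of the driver-to-executor hand-off described above: instead of every executor task listing delete delta files per block, the driver collects that information once and ships it with the split. All names here, such as `DeleteDeltaInfoSketch`, are illustrative and not the PR's actual classes.

```java
import java.io.Serializable;

// Illustrative only: a serializable holder the driver could attach to each input split
// so executor tasks no longer list delete delta files per block themselves, closing the
// wrong-result window under a concurrent query + IUD operation mentioned above.
public class DeleteDeltaInfoSketch implements Serializable {

  private final String[] deleteDeltaFilePaths;        // gathered once on the driver
  private final long latestDeleteDeltaFileTimestamp;  // newest delta file timestamp

  public DeleteDeltaInfoSketch(String[] deleteDeltaFilePaths, long[] fileTimestamps) {
    this.deleteDeltaFilePaths = deleteDeltaFilePaths;
    long latest = 0L;
    for (long timestamp : fileTimestamps) {
      latest = Math.max(latest, timestamp);
    }
    this.latestDeleteDeltaFileTimestamp = latest;
  }

  public String[] getDeleteDeltaFilePaths() {
    return deleteDeltaFilePaths;
  }

  // executors compare this against a cached block's delete delta timestamp:
  // if the cache is at least this new, the cached deleted-row map can be reused
  public long getLatestDeleteDeltaFileTimestamp() {
    return latestDeleteDeltaFileTimestamp;
  }
}
```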

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata IUDPerformanceImprovement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1019


commit 60cfc66fe1f2de4cc3c2395a4dd479abb2a602f4
Author: kumarvishal 
Date:   2017-06-12T10:36:24Z

Fixed synchronization issue and improved IUD performance



