[ 
https://issues.apache.org/jira/browse/HIVE-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867996#action_12867996
 ] 

Sachin Bochare commented on HIVE-678:
-------------------------------------

I was doing some performance analysis of indexing using this indexing 
patch. I found three issues in the patch and fixed them temporarily so that 
I could move ahead with my experiments.

The three issues are:
# A regression was introduced in selecting data from a table: simple select 
queries were returning 0 rows. The problem was that the temporary output path 
used by Hive was not properly set.
# The index records were off by one: the offset of the next index value was 
shown in the current index value.
# The file name and the first offset value were not separated by a delimiter.
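
To illustrate the second issue, here is a minimal, self-contained sketch of why the order of "read the record" and "capture the position" matters. It does not use Hive classes: ToyReader and index() are hypothetical names made up for this example, and the reader simply assumes each record is followed by a one-byte newline.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetBeforeRead {
    /** Hypothetical stand-in for a record reader that tracks a byte position. */
    static final class ToyReader {
        private final String[] records;
        private int index = 0;
        private long pos = 0;

        ToyReader(String... records) {
            this.records = records;
        }

        long getPos() {
            return pos;
        }

        /** Returns the next record, or null at end of input. */
        String next() {
            if (index >= records.length) {
                return null;
            }
            String rec = records[index++];
            pos += rec.length() + 1; // assumption: one newline byte per record
            return rec;
        }
    }

    /**
     * Builds "offset:record" entries. With capturePosFirst == false the
     * position is taken after the read, so every entry carries the offset
     * of the record that follows it.
     */
    static List<String> index(ToyReader reader, boolean capturePosFirst) {
        List<String> out = new ArrayList<>();
        while (true) {
            long before = reader.getPos();
            String rec = reader.next();
            if (rec == null) {
                break;
            }
            long key = capturePosFirst ? before : reader.getPos();
            out.add(key + ":" + rec);
        }
        return out;
    }

    public static void main(String[] args) {
        // Position taken after the read: [3:aa, 7:bbb, 9:c]
        System.out.println(index(new ToyReader("aa", "bbb", "c"), false));
        // Position taken before the read: [0:aa, 3:bbb, 7:c]
        System.out.println(index(new ToyReader("aa", "bbb", "c"), true));
    }
}
```

In the first output every offset is shifted forward by one record, which is the off-by-one symptom described in the list above.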

I will send out my experiment results to you in a day or two.

The following patch fixes those three issues. It needs to be applied after 
applying the attached patch (hive-678-2009-07-25.patch).

{code}
diff -ur Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
--- Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java	2010-05-16 20:05:58.000000000 +0530
+++ Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java	2010-05-16 20:10:56.000000000 +0530
@@ -348,8 +348,12 @@
     }

     String hiveScratchDir = getScratchDir();
-    Path jobScratchDir = new Path(hiveScratchDir);
-    Path outPath = new Path(getOutputPath());
+    Path jobScratchDir = new Path(hiveScratchDir + Utilities.randGen.nextInt());
+    Path outPath = jobScratchDir;
+    if (outputPath != null) {
+        outPath = new Path(outputPath);
+    }
+
     if (isDelOutputIfExists()) {
       try {
         FileSystem outFs = outPath.getFileSystem(job);
diff -ur Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexRecordReader.java Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexRecordReader.java
--- Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexRecordReader.java	2010-05-16 20:05:58.000000000 +0530
+++ Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexRecordReader.java	2010-05-16 20:11:38.000000000 +0530
@@ -64,7 +64,6 @@
    * to upper.
    */
   public boolean next(Object key, Object value) throws IOException {
-    boolean result = rawReader.next(rawKey, value);
     if (!blockCompressed) {
       try {
         ((LongWritable) key).set(rawReader.getPos());
@@ -74,6 +73,7 @@
     } else {
       ((LongWritable) key).set(blockStart);
     }
+    boolean result = rawReader.next(rawKey, value);
     return result;
   }

diff -ur Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/index/IndexBuilderCompactSumReducer.java Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/index/IndexBuilderCompactSumReducer.java
--- Hive-796926-patch-HIVE-678/ql/src/java/org/apache/hadoop/hive/ql/index/IndexBuilderCompactSumReducer.java	2010-05-16 20:05:58.000000000 +0530
+++ Hive-796926-patch-HIVE-678-Modified/ql/src/java/org/apache/hadoop/hive/ql/index/IndexBuilderCompactSumReducer.java	2010-05-16 20:12:36.000000000 +0530
@@ -95,6 +95,7 @@
     SortedSet<Long> poses = bucketOffsetMap.get(bucketName);
     Iterator<Long> posIter = poses.iterator();
     if (posIter.hasNext()) {
+      bl.append(HiveIndex.BUCKET_POS_VAL_SEPARATOR);
       bl.append(posIter.next());
     }
     while (posIter.hasNext()) {
{code}
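
For the third issue, the following is a rough sketch of the intended entry layout, not the reducer code itself. The class name, the format() helper, and the "," separator are assumptions made for illustration; the actual value of HiveIndex.BUCKET_POS_VAL_SEPARATOR is not shown in this comment.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class IndexEntryFormat {
    // Assumption: stands in for HiveIndex.BUCKET_POS_VAL_SEPARATOR, whose
    // real value is not given here.
    static final String SEPARATOR = ",";

    /**
     * Joins a bucket (file) name and its sorted offsets, placing the
     * separator before every offset, including the first one.
     */
    static String format(String bucketName, SortedSet<Long> offsets) {
        StringBuilder sb = new StringBuilder(bucketName);
        for (long offset : offsets) {
            sb.append(SEPARATOR).append(offset);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedSet<Long> offsets = new TreeSet<>(Arrays.asList(4096L, 0L, 128L));
        // Prints: bucket_0,0,128,4096
        System.out.println(format("bucket_0", offsets));
    }
}
```

Without the separator before the first offset, the same entry would begin with "bucket_00", and a reader could no longer tell where the file name ends and the first offset begins.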

Thanks,
Sachin


> Add support for building index table
> ------------------------------------
>
>                 Key: HIVE-678
>                 URL: https://issues.apache.org/jira/browse/HIVE-678
>             Project: Hadoop Hive
>          Issue Type: Sub-task
>          Components: Metastore, Query Processor
>    Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-678-2009-07-25.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
