Review Request: HIVE-1948
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/398/
---

Review request for hive.

Summary
---

Adds audit logging to the metastore.

This addresses bug HIVE-1948.
    https://issues.apache.org/jira/browse/HIVE-1948

Diffs
---

  http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1067080

Diff: https://reviews.apache.org/r/398/diff

Testing
---

Tested manually and checked the audit logs appear in the configured log file.

Thanks,

Devaraj
[jira] Commented: (HIVE-1948) Have audit logging in the Metastore
[ https://issues.apache.org/jira/browse/HIVE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991189#comment-12991189 ]

Devaraj Das commented on HIVE-1948:
------------------------------------

https://reviews.apache.org/r/398/ is the Review Board URL.

Have audit logging in the Metastore
-----------------------------------

                Key: HIVE-1948
                URL: https://issues.apache.org/jira/browse/HIVE-1948
            Project: Hive
         Issue Type: Improvement
         Components: Metastore
           Reporter: Devaraj Das
           Assignee: Devaraj Das
            Fix For: 0.7.0
        Attachments: audit-log.1.patch, audit-log.patch

It would be good to have audit logging in the metastore, similar to Hadoop's NameNode audit logging. This would allow administrators to dig into details about which user performed metadata operations (like create/drop tables/partitions) and from where (IP address).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
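For context on the mechanism under review: NameNode-style audit logging amounts to one structured log line per metadata operation, recording the caller, the client address, the command, and the target object. A minimal sketch of that shape follows; the logger name and the logAuditEvent helper are illustrative assumptions, not the code in audit-log.patch.

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.logging.Logger;

    // Illustrative sketch only: a NameNode-style audit line recording who
    // performed which metadata operation, on what, and from which address.
    // The actual patch wires this into HiveMetaStore.java's own call sites.
    public class MetastoreAuditSketch {

        // A dedicated logger so audit output can be routed to its own file.
        private static final Logger AUDIT_LOG =
            Logger.getLogger("org.apache.hadoop.hive.metastore.HiveMetaStore.audit");

        // Hypothetical helper: one line per operation.
        static void logAuditEvent(String user, String cmd, String target,
                                  InetAddress addr) {
            AUDIT_LOG.info("ugi=" + user
                + "\tip=" + (addr == null ? "unknown" : addr.getHostAddress())
                + "\tcmd=" + cmd
                + "\ttarget=" + target);
        }

        public static void main(String[] args) throws UnknownHostException {
            // e.g. a create_table call arriving from a client machine
            logAuditEvent("hive_user", "create_table", "default.srcbucket",
                InetAddress.getByName("10.0.0.12"));
        }
    }

Keeping the audit logger separate from the service logger is what lets administrators route audit lines to a dedicated log file, as the review's testing notes describe.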
[jira] Assigned: (HIVE-1900) a mapper should be able to span multiple partitions
[ https://issues.apache.org/jira/browse/HIVE-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-1900:
--------------------------------

    Assignee: Namit Jain  (was: Krishna Kumar)

a mapper should be able to span multiple partitions
----------------------------------------------------

                Key: HIVE-1900
                URL: https://issues.apache.org/jira/browse/HIVE-1900
            Project: Hive
         Issue Type: Improvement
         Components: Query Processor
           Reporter: Namit Jain
           Assignee: Namit Jain
        Attachments: hive.1900.1.patch, hive.1900.2.patch

Currently, a mapper only spans a single partition, which creates a problem in the presence of many small partitions (which is becoming a common use case at Facebook). If the plan is the same, a mapper should be able to span files across multiple partitions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
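The core of the proposal is a split-grouping step. The sketch below shows that idea in isolation, with a hypothetical FileInfo descriptor and planKey (it is not the attached patch): files from different partitions that share the same operator plan are packed into one split up to a target size, so a single mapper spans several small partitions.

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Sketch of the grouping idea only: files from different partition
    // directories that feed the same operator plan are packed into one
    // split until a target size is reached, so one mapper spans partitions.
    public class CombineSplitsSketch {

        // Hypothetical descriptor; Hive's real input formats work on
        // FileSplits and partition descriptors instead.
        static class FileInfo {
            final String path;    // e.g. /warehouse/t/ds=2011-02-07/000000_0
            final long length;
            final String planKey; // files sharing a plan may be combined
            FileInfo(String path, long length, String planKey) {
                this.path = path; this.length = length; this.planKey = planKey;
            }
        }

        static List<List<FileInfo>> combine(List<FileInfo> files,
                                            long targetSplitSize) {
            // Bucket files by plan key, preserving input order.
            Map<String, List<FileInfo>> byPlan = new LinkedHashMap<>();
            for (FileInfo f : files) {
                byPlan.computeIfAbsent(f.planKey, k -> new ArrayList<>()).add(f);
            }
            // Within each plan group, pack files greedily up to the target.
            List<List<FileInfo>> splits = new ArrayList<>();
            for (List<FileInfo> group : byPlan.values()) {
                List<FileInfo> current = new ArrayList<>();
                long size = 0;
                for (FileInfo f : group) {
                    if (!current.isEmpty() && size + f.length > targetSplitSize) {
                        splits.add(current);
                        current = new ArrayList<>();
                        size = 0;
                    }
                    current.add(f);
                    size += f.length;
                }
                if (!current.isEmpty()) {
                    splits.add(current);
                }
            }
            return splits;
        }
    }

Grouping by plan key first is what keeps the combination safe: files may only share a mapper if the same operator tree applies to all of them, which is the "if the plan is the same" condition in the issue description.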
[jira] Updated: (HIVE-1961) Make Stats gathering more flexible with timeout and atomicity
[ https://issues.apache.org/jira/browse/HIVE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1961:
-----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed. Thanks Ning

Make Stats gathering more flexible with timeout and atomicity
--------------------------------------------------------------

                Key: HIVE-1961
                URL: https://issues.apache.org/jira/browse/HIVE-1961
            Project: Hive
         Issue Type: Improvement
           Reporter: Ning Zhang
           Assignee: Ning Zhang
        Attachments: HIVE-1961.patch

The JDBC-type stats publisher and aggregator use a hard-coded timeout value. We should make it configurable. We should also introduce a Hive variable to allow atomic stats gathering (metastore stats are updated only if all types of stats are made available).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
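A minimal sketch of the two knobs described above, under assumed property names (the committed patch may spell them differently): a configurable JDBC query timeout replacing the hard-coded value, and an atomicity flag that skips the metastore update unless every stat type was gathered.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.Properties;

    // Sketch only; the property names below are illustrative guesses,
    // not necessarily the ones the committed patch uses.
    public class StatsPublisherSketch {

        public static void publish(Properties conf, String updateSql,
                                   boolean allStatsAvailable) throws SQLException {
            // 1) Configurable timeout instead of a hard-coded one.
            int timeoutSecs = Integer.parseInt(
                conf.getProperty("hive.stats.jdbc.timeout", "30"));

            // 2) Atomicity: only touch the metastore stats if every stat
            //    type was gathered successfully.
            boolean atomic = Boolean.parseBoolean(
                conf.getProperty("hive.stats.atomic", "false"));
            if (atomic && !allStatsAvailable) {
                return; // skip the update rather than record partial stats
            }

            try (Connection conn = DriverManager.getConnection(
                     conf.getProperty("hive.stats.dbconnectionstring"));
                 Statement stmt = conn.createStatement()) {
                stmt.setQueryTimeout(timeoutSecs); // standard JDBC timeout hook
                stmt.executeUpdate(updateSql);
            }
        }
    }

The standard Statement.setQueryTimeout hook is the natural place for such a knob, since it bounds each publish/aggregate round-trip without any driver-specific configuration.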
Build failed in Hudson: Hive-trunk-h0.20 #538
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/538/changes

Changes:

[namit] HIVE-1961 Make Stats gathering more flexible with timeout and atomicity
(Ning Zhang via namit)

------------------------------------------
[...truncated 19174 lines...]
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt
    [junit] Loading data to table srcbucket
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] PREHOOK: type: CREATETABLE
    [junit] POSTHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] POSTHOOK: type: CREATETABLE
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
[The console output continues with the same PREHOOK/POSTHOOK LOAD DATA sequence for srcbucket20.txt through srcbucket23.txt into table srcbucket2, kv1.txt into table src, and kv3.txt into table src1.]
[jira] Updated: (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marquis Wang updated HIVE-1803:
-------------------------------

    Attachment: HIVE-1803.2.patch

We fixed the problem in BitmapCollectSet by looking at the PercentileApprox UDAF to figure out how to use an array as input to a UDAF. This new patch is a working implementation of bitmap indexing. The new test index_bitmap.q shows how to use the index.

However, I am unable to add the test itself; I get errors when I run

    ant test -Dtestcase=TestCliDriver -Dqfile=index_bitmap.q -Doverwrite=true -Dtest.silent=false

It says

    Exception: java.lang.RuntimeException: The table default__srcpart_srcpart_index_proj__ is an index table. Please do drop index instead.

with regard to the ALTER INDEX REBUILD line in the test. We're pretty confused about whether we're writing the new test incorrectly and would appreciate any help.

While we work to get around that, we're also going to go ahead and work on a compressed bitmap, since this implementation does no compression.

Implement bitmap indexing in Hive
---------------------------------

                Key: HIVE-1803
                URL: https://issues.apache.org/jira/browse/HIVE-1803
            Project: Hive
         Issue Type: New Feature
           Reporter: Marquis Wang
           Assignee: Marquis Wang
        Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, bitmap_index_1.png, bitmap_index_2.png

Implement a bitmap index handler to complement compact indexing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
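Since the comment notes the current implementation does no compression, the standalone sketch below shows what an uncompressed per-value bitmap amounts to, using java.util.BitSet outside Hive's UDAF API: one bit per row offset, with bitwise AND supporting multi-predicate lookups. It illustrates the data structure only, not the BitmapCollectSet code in the patch.

    import java.util.BitSet;

    // Standalone sketch of an uncompressed bitmap index entry, the kind of
    // state a bitmap-collecting UDAF accumulates (shown here outside Hive's
    // GenericUDAF API). Bit i set means row i of the block matches the
    // indexed value.
    public class UncompressedBitmapSketch {

        private final BitSet bits = new BitSet();

        void addRow(int rowOffset) {
            bits.set(rowOffset);
        }

        // Intersection supports multi-predicate lookups (AND of two indexes).
        static BitSet and(BitSet a, BitSet b) {
            BitSet out = (BitSet) a.clone();
            out.and(b);
            return out;
        }

        public static void main(String[] args) {
            UncompressedBitmapSketch keyEq100 = new UncompressedBitmapSketch();
            keyEq100.addRow(0);
            keyEq100.addRow(7);
            UncompressedBitmapSketch valEqFoo = new UncompressedBitmapSketch();
            valEqFoo.addRow(7);
            // rows matching key = 100 AND value = 'foo' -> prints {7}
            System.out.println(and(keyEq100.bits, valEqFoo.bits));
        }
    }

Run-length compression schemes shrink the long zero runs such a naive bitmap stores explicitly, which is presumably the gain the planned compressed variant is after.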
[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system
[ https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991282#comment-12991282 ]

Paul Yang commented on HIVE-1918:
---------------------------------

Looking at it as well..

Add export/import facilities to the hive system
-----------------------------------------------

                Key: HIVE-1918
                URL: https://issues.apache.org/jira/browse/HIVE-1918
            Project: Hive
         Issue Type: New Feature
         Components: Query Processor
           Reporter: Krishna Kumar
           Assignee: Krishna Kumar
        Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf

This is an enhancement request to add export/import features to Hive. With this language extension, the user can export the data of a table - which may be located in different HDFS locations in the case of a partitioned table - as well as the metadata of the table to a specified output location. This output location can then be moved over to a different hadoop/hive instance and imported there.

This should work independently of the source and target metastore DBMS used; for instance, between Derby and MySQL.

For partitioned tables, the ability to export/import a subset of the partitions must be supported.

Howl will add more features on top of this: the ability to create/use the exported data even in the absence of Hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for these details.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
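To visualize what such an export bundle might contain, here is a hedged sketch of the layout the description implies: table metadata serialized next to copies of the partition directories, so the bundle is self-describing and independent of the source metastore DBMS. The _metadata file name and its contents are assumptions for illustration, not the format in the HIVE-1918 patches, and java.nio on a local filesystem stands in for HDFS for brevity.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    // Sketch of a self-describing export bundle: the table's metadata is
    // written next to copies of its (possibly scattered) partition
    // directories, so another instance can re-create the table. File name
    // and metadata format here are illustrative assumptions.
    public class TableExportSketch {

        static void exportTable(String metadataJson, Path[] partitionDirs,
                                Path exportRoot) throws IOException {
            Files.createDirectories(exportRoot);

            // Metadata travels with the data, independent of which
            // metastore DBMS (Derby, MySQL, ...) produced it.
            Files.write(exportRoot.resolve("_metadata"),
                metadataJson.getBytes(StandardCharsets.UTF_8));

            // Copy each partition directory under the export root; for a
            // partitioned table these may come from different locations.
            for (Path dir : partitionDirs) {
                Path dest = exportRoot.resolve(dir.getFileName());
                Files.createDirectories(dest);
                try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                    for (Path f : files) {
                        Files.copy(f, dest.resolve(f.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }

Bundling metadata with data is also what enables the Howl use case cited above: a consumer with no Hive installation can still read the _metadata file and process the copied directories with MR or Pig.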