[jira] [Updated] (ATLAS-844) Remove titan berkeley and elastic search jars if hbase/solr based profiles are chosen
[ https://issues.apache.org/jira/browse/ATLAS-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Yamijala updated ATLAS-844: --- Attachment: ATLAS-844.patch The attached patch does the following:
* When the profile selected for packaging is anything other than {{berkeley-elasticsearch}}, it excludes the Berkeley DB Java Edition jar, the Elasticsearch jar, and the corresponding Titan adapters.
* When the profile selected is {{berkeley-elasticsearch}}, it excludes only the Berkeley DB Java Edition jar.
* I think the exclusion is required because Berkeley DB Java Edition carries the Sleepycat license (http://www.oracle.com/technetwork/database/berkeleydb/downloads/jeoslicense-086837.html, https://opensource.org/licenses/Sleepycat), which appears to be incompatible with distribution under the Apache Software License. Some reference to this in the Titan mailing list archives: https://groups.google.com/d/msg/aureliusgraphs/5zF6zzGRFEs/igecqgkAOqkJ
* Adds documentation on how to get the Berkeley DB jar if required. (I used the documentation steps from Apache Falcon, which has similar needs, I think.)
Note the following:
* I have only removed the bundling of the jars in the Atlas distribution. AFAIK, that's the only requirement.
* The Titan adapters are covered under the Apache license, hence they can be redistributed.
* The involved jars were part of the Atlas server war file, hence the changes to {{maven-war-plugin}}.
I tested this with:
* {{mvn clean install}} - runs all tests and passes
* {{mvn clean package -Pdist}} - default profile (pointing to external HBase and Solr, hence removes the other jars)
* {{mvn clean package -Pdist,berkeley-elasticsearch}} - includes the Berkeley Titan adapter, the Elasticsearch jar, and the Elasticsearch Titan adapter. Copied the Berkeley JE jar to ${atlas_home}/extlib and the server starts up fine.
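The {{maven-war-plugin}} change described above can be sketched roughly as follows. This is an illustrative assumption of what such a profile-driven exclusion looks like, not the exact patch contents; the artifact file-name patterns are hypothetical:

```xml
<!-- Hedged sketch: exclude the Sleepycat-licensed Berkeley DB JE jar from the
     packaged war. In profiles other than berkeley-elasticsearch, the
     Elasticsearch jar and the corresponding Titan adapter jars would be
     excluded in the same way. File-name patterns are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <configuration>
    <packagingExcludes>
      WEB-INF/lib/je-*.jar,
      WEB-INF/lib/elasticsearch-*.jar,
      WEB-INF/lib/titan-berkeleyje-*.jar,
      WEB-INF/lib/titan-es-*.jar
    </packagingExcludes>
  </configuration>
</plugin>
```

With a configuration along these lines, the war builds and deploys normally, and users who need Berkeley DB can drop the JE jar into ${atlas_home}/extlib as described above.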
> Remove titan berkeley and elastic search jars if hbase/solr based profiles > are chosen > - > > Key: ATLAS-844 > URL: https://issues.apache.org/jira/browse/ATLAS-844 > Project: Atlas > Issue Type: Bug >Affects Versions: 0.7-incubating >Reporter: Hemanth Yamijala > Fix For: 0.7-incubating > > Attachments: ATLAS-844.patch > > > With ATLAS-833, users of Atlas now have the option of using an external > HBase/Solr installation, a self-contained HBase/Solr installation (embedded > mode) or a BerkeleyDB/Elastic Search installation. > When choosing either of the first two modes, we can potentially remove the > Titan berkeley DB or elastic search jars. This helps distributions which have > restrictions on using these jars from a contractual perspective. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated ATLAS-904: --- Attachment: ATLAS-904.2.patch Changes to address [~yhemanth] review comments.
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
> Hive hook fails due to session state not being set
> --
> Key: ATLAS-904
> URL: https://issues.apache.org/jira/browse/ATLAS-904
> Project: Atlas
> Issue Type: Bug
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
> Priority: Blocker
> Fix For: 0.7-incubating
> Attachments: ATLAS-904.1.patch, ATLAS-904.2.patch, ATLAS-904.patch
>
> {noformat}
> 2016-06-15 11:34:30,423 WARN [Atlas Logger 0]: hook.HiveHook (HiveHook.java:normalize(557)) - Could not rewrite query due to error.
> Proceeding with original query EXPORT TABLE test_export_table to 'hdfs://localhost:9000/hive_tables/test_path1'
> java.lang.NullPointerException: Conf non-local session path expected to be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
> at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:641)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:133)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:120)
> at org.apache.atlas.hive.rewrite.HiveASTRewriter.<init>(HiveASTRewriter.java:44)
> at org.apache.atlas.hive.hook.HiveHook.normalize(HiveHook.java:554)
> at org.apache.atlas.hive.hook.HiveHook.getProcessReferenceable(HiveHook.java:702)
> at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:596)
> at org.apache.atlas.hive.hook.HiveHook.fireAndForget(HiveHook.java:222)
> at org.apache.atlas.hive.hook.HiveHook.access$200(HiveHook.java:77)
> at org.apache.atlas.hive.hook.HiveHook$2.run(HiveHook.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-06-15 11:34:30,423 ERROR [Atlas Logger 0]: hook.HiveHook (HiveHook.java:run(184)) - Atlas hook failed due to error
> java.lang.NullPointerException
> at java.lang.StringBuilder.<init>(StringBuilder.java:109)
> at org.apache.atlas.hive.hook.HiveHook.getProcessQualifiedName(HiveHook.java:738)
> at org.apache.atlas.hive.hook.HiveHook.getProcessReferenceable(HiveHook.java:703)
> at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:596)
> at org.apache.atlas.hive.hook.HiveHook.fireAndForget(HiveHook.java:222)
> at org.apache.atlas.hive.hook.HiveHook.access$200(HiveHook.java:77)
> at org.apache.atlas.hive.hook.HiveHook$2.run(HiveHook.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
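The WARN in the trace above comes from a rewrite attempt that falls back to the original query when the rewrite fails. A minimal sketch of that pattern follows; the class and interface names are illustrative, not the actual Atlas HiveHook code:

```java
// Hedged sketch of the fall-back behaviour the WARN log above implies:
// try to rewrite (normalize) the query, and on any failure proceed with
// the original query instead of letting the hook fail.
class QueryNormalizer {
    interface Rewriter {
        String rewrite(String query) throws Exception;
    }

    static String normalize(String query, Rewriter rewriter) {
        try {
            return rewriter.rewrite(query);
        } catch (Exception e) {
            // Mirrors "Could not rewrite query due to error.
            // Proceeding with original query ..."
            return query;
        }
    }

    public static void main(String[] args) {
        String q = "EXPORT TABLE test_export_table TO '/tmp/p'";
        // A rewriter that always fails, e.g. because SessionState was not set.
        String result = normalize(q, s -> {
            throw new IllegalStateException(
                "Conf non-local session path expected to be non-null");
        });
        System.out.println(result.equals(q)); // prints true
    }
}
```

Note that the second (ERROR) trace above is different: there the NullPointerException escapes getProcessQualifiedName itself, so no such fall-back applies.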
[jira] [Comment Edited] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338949#comment-15338949 ] Suma Shivaprasad edited comment on ATLAS-904 at 6/20/16 3:49 AM: - Changes to address [~yhemanth] review comments.
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
3. When the input is of type local dir or hdfs path, it currently doesn't add it to the qualified name. The reason is that partition-based paths would otherwise cause a lot of processes to be created instead of updating the same process.
Pending: Address [~shwethags] suggestion to add hdfs paths to the process qualified name only in the case of non-partition-based queries. This needs to be done per HiveOperation type:
1. If HiveOperation = LOAD, IMPORT, EXPORT - detect if the current query context is dealing with partitions and do not add the path if it is partition based.
2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, then detect if the query context is dealing with a partitioned table in the inputs and decide whether to add the path.
was (Author: suma.shivaprasad): Changes to address [~yhemanth] review comments.
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
Review Request 48939: ATLAS-904 Handle process qualified name per Hive Operation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48939/ ---
Review request for atlas, Shwetha GS and Hemanth Yamijala.
Repository: atlas
Description
---
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
3. When the input is of type local dir or hdfs path, it currently doesn't add it to the qualified name. The reason is that partition-based paths would otherwise cause a lot of processes to be created instead of updating the same process.
Pending: Address Shwetha G S's suggestion to add hdfs paths to the process qualified name only in the case of non-partition-based queries. This needs to be done per HiveOperation type:
1. If HiveOperation = LOAD, IMPORT, EXPORT - detect if the current query context is dealing with partitions and do not add the path if it is partition based.
2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, then detect if the query context is dealing with a partitioned table in the inputs and decide whether to add the path.
Diffs
-
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 23c82df
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e7fbf71
webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30
Diff: https://reviews.apache.org/r/48939/diff/
Testing
---
Existing tests modified to query with the new qualified name. Need to add tests for INSERT INTO TABLE.
Thanks,
Suma Shivaprasad
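The qualified-name scheme from the description can be sketched as follows. The separators ("->", ":") and method names are illustrative assumptions, not the actual HiveHook implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of the scheme described above: qualified name =
// operation name + sorted inputs + sorted outputs, with the per-output
// WriteType appended so that INSERT, INSERT_OVERWRITE, PATH_WRITE,
// UPDATE, DELETE etc. yield distinct process names.
class ProcessQualifiedName {
    static String build(String operation, List<String> inputs,
                        Map<String, String> outputWriteTypes) {
        StringBuilder sb = new StringBuilder(operation);
        List<String> sortedInputs = new ArrayList<>(inputs);
        Collections.sort(sortedInputs);          // sorted inputs
        for (String in : sortedInputs) {
            sb.append("->").append(in);
        }
        List<String> sortedOutputs = new ArrayList<>(outputWriteTypes.keySet());
        Collections.sort(sortedOutputs);         // sorted outputs
        for (String out : sortedOutputs) {
            // Append the WriteType for each output, since
            // HiveOperation.name alone cannot distinguish write kinds.
            sb.append("->").append(out).append(':')
              .append(outputWriteTypes.get(out));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> outputs = new LinkedHashMap<>();
        outputs.put("default.t3", "INSERT");
        System.out.println(build("QUERY",
                List.of("default.t2", "default.t1"), outputs));
        // prints QUERY->default.t1->default.t2->default.t3:INSERT
    }
}
```

Because inputs and outputs are sorted before concatenation, re-running the same statement maps to the same process entity instead of creating a new one.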
[jira] [Commented] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338961#comment-15338961 ] Suma Shivaprasad commented on ATLAS-904: https://reviews.apache.org/r/48939
Re: Review Request 48939: ATLAS-904 Handle process qualified name per Hive Operation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48939/ ---
(Updated June 20, 2016, 4 a.m.)
Review request for atlas, Shwetha GS and Hemanth Yamijala.
Bugs: ATLAS-904
https://issues.apache.org/jira/browse/ATLAS-904
Repository: atlas
Description
---
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
3. When the input is of type local dir or hdfs path, it currently doesn't add it to the qualified name. The reason is that partition-based paths would otherwise cause a lot of processes to be created instead of updating the same process.
Pending: Address Shwetha G S's suggestion to add hdfs paths to the process qualified name only in the case of non-partition-based queries. This needs to be done per HiveOperation type:
1. If HiveOperation = LOAD, IMPORT, EXPORT - detect if the current query context is dealing with partitions and do not add the path if it is partition based.
2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, then detect if the query context is dealing with a partitioned table in the inputs and decide whether to add the path.
Diffs
-
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 23c82df
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e7fbf71
webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30
Diff: https://reviews.apache.org/r/48939/diff/
Testing
---
Existing tests modified to query with the new qualified name. Need to add tests for INSERT INTO TABLE.
Thanks,
Suma Shivaprasad
[jira] [Updated] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated ATLAS-904: --- Attachment: (was: ATLAS-904.2.patch)
Re: Review Request 48939: ATLAS-904 Handle process qualified name per Hive Operation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48939/ ---
(Updated June 20, 2016, 4 a.m.)
Review request for atlas, Shwetha GS and Hemanth Yamijala.
Bugs: ATLAS-904
https://issues.apache.org/jira/browse/ATLAS-904
Repository: atlas
Description
---
1. Process qualified name = HiveOperation.name + sorted inputs + sorted outputs
2. HiveOperation.name doesn't provide identifiers for distinguishing INSERT, INSERT_OVERWRITE, UPDATE, DELETE etc. separately. Hence adding WriteEntity.WriteType as well, which exhibits the following behaviour:
a. If there are multiple outputs, for each output, adds the query type (WriteType)
b. If the query being run is of type INSERT [INTO/OVERWRITE] TABLE [PARTITION], WriteType is INSERT/INSERT_OVERWRITE
c. If the query is of type INSERT OVERWRITE hdfs_path, adds WriteType as PATH_WRITE
d. If the query is of type UPDATE/DELETE, adds type as UPDATE/DELETE [Note - lineage is not available for this since this is a single-table operation]
3. When the input is of type local dir or hdfs path, it currently doesn't add it to the qualified name. The reason is that partition-based paths would otherwise cause a lot of processes to be created instead of updating the same process.
Pending: Address Shwetha G S's suggestion to add hdfs paths to the process qualified name only in the case of non-partition-based queries. This needs to be done per HiveOperation type:
1. If HiveOperation = LOAD, IMPORT, EXPORT - detect if the current query context is dealing with partitions and do not add the path if it is partition based.
2. If HiveOperation = INSERT OVERWRITE DFS_PATH/LOCAL_PATH, then detect if the query context is dealing with a partitioned table in the inputs and decide whether to add the path.
Diffs (updated)
-
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java c956a32
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 23c82df
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e7fbf71
webapp/src/main/java/org/apache/atlas/web/resources/EntityResource.java 0713d30
Diff: https://reviews.apache.org/r/48939/diff/
Testing
---
Existing tests modified to query with the new qualified name. Need to add tests for INSERT INTO TABLE.
Thanks,
Suma Shivaprasad
[jira] [Updated] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated ATLAS-904: --- Attachment: ATLAS-904.2.patch
[jira] [Commented] (ATLAS-904) Hive hook fails due to session state not being set
[ https://issues.apache.org/jira/browse/ATLAS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338991#comment-15338991 ] ATLAS QA commented on ATLAS-904:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12811744/ATLAS-904.2.patch against master revision 436a524.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
+1 checkstyle. The patch generated 0 code style errors.
{color:red}-1 findbugs{color}. The patch appears to introduce 379 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in: org.apache.atlas.repository.typestore.GraphBackedTypeStoreTest
Test results: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningswebapp.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsauthorization.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningscommon.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningssqoop-bridge.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningshdfs-model.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsstorm-bridge.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsfalcon-bridge.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningshive-bridge.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsrepository.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningstypesystem.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningscatalog.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsclient.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningsnotification.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-ATLAS-Build/325//artifact/patchprocess/newPatchFindbugsWarningstitan.html
Console output: https://builds.apache.org/job/PreCommit-ATLAS-Build/325//console
This message is automatically generated.