> On Dec. 2, 2017, 12:22 a.m., Sahil Takiar wrote:
> > Since we touch the `LoadSemanticAnalyzer` could we add a q-test (could be 
> > added to one of the existing `lineage*.q` files) for `LOAD` statements. 
> > Same for import / export statements (as far as I can tell there are no 
> > existing ones, correct me if I am wrong).
> > 
> > If you have time, it would be great to run some of the lineage tests for 
> > HoS too, but since thats a bit orthogonal to this JIRA, it can be done in a 
> > follow up JIRA.
> 
> Andrew Sherman wrote:
>     I will addsome more tests...

Update: I looked into his some more. It turns out that Lineage info is printed 
by default by the PostExecutePrinter.
So any .q tests should show their lineage in the output.
HoS tests do print some lineage so I think that part is covered.
But currently no lineage is printed for LOAD/IMPORT/EXPORT.
The MoveTask does call getLineageState().setLineage() to set lineage into the 
LineageState but it does not call getLineageState().mapDirToOp().
Possibly this is because LOAD statements don't have a 
org.apache.hadoop.hive.ql.exec.Operator
And IMPORT statements behave similarly.
IMO the whole Lineage stuff is not clearly specified. 
Do you think it is worth doing follow-up work to try to document /test) it 
better?


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64193/#review192601
-----------------------------------------------------------


On Nov. 30, 2017, 1:22 a.m., Andrew Sherman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64193/
> -----------------------------------------------------------
> 
> (Updated Nov. 30, 2017, 1:22 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query
> completes. This results in Lineage for other running queries being lost.
> 
> To fix this, move LineageState from SessionState to QueryState.
> In MoveTask/MoveWork use the LineageState from the MoveTask's QueryState
> rather than trying to use it from MoveWork.
> Add a test which runs multiple jdbc queries in a thread pool
> against the same connection and show that Vertices are not lost from Lineage.
> As part of this test, add ReadableHook, an ExecuteWithHookContext that stores
> HookContexts in memory and makes them available for reading.
> Make LineageLogger methods static so they can be used elsewhere.
> 
> Sometimes a running query (originating in a Driver) will instantiate
> another Driver to run or compile another query. Because these Drivers
> shared a Session, the child Driver would accumulate Lineage information
> along with that of the parent Driver. For consistency a LineageState is
> passed to these child Drivers and stored in the new Driver's QueryState.
> 
> 
> Diffs
> -----
> 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> f5ed735c1ec14dfee338e56020fa2629b168389d 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> af9f193dc94e2e05caa88d965a34f4483c9d7069 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryState.java 
> 7d5aa8b179e536e25c41a8946e667f8dd5669e0f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> e7af5e004fb560b574b82f6d1b60517511802f37 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 
> e2f8c1f8012ad25114e279747e821b291c7f4ca6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 
> 1f0487f4f72ab18bcf876f45ad5758d83a7f001b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
>  262225fc202d4627652acfd77350e44b0284b3da 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java
>  bb1f4e50509e57a9d0b9e6793c1fc08baa4d2981 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 
> 7b617309f6b0d8a7ce0dea80ab1f790c2651b147 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/LineageLogger.java 
> 2f764f8a29a9d41a7db013a949ffe3a8a9417d32 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadableHook.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java 
> 68709b4d3baf15d78e60e948ccdef3df84f28cec 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 
> 1e577da82343a1b7361467fb662661f9c6642ec0 
>   ql/src/java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java 
> 29886ae7f97f8dae7116f4fc9a2417ab8f9dac0a 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
> 7b067a0d45e33bc3347c43b050af933c296a9227 
>   
> ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
> 504b0623142a6fa6cdb45a26b49f146e12ec2d7a 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
> d7a83f775abca39b219f71aff88173a14ffaee9f 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java 
> 4387c4297fee48d4c03e95d5a2fcb822ab480eeb 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 67739a1db9fc52a67f4f5ea7dba80fe0e95750c8 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java 
> 338c1856672f09bb7da35d2336ebb5b6f3fdc5a6 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java 
> e6c07713b24df719315d804f006151106eea9aed 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 1fd634c928a5384b09d97322c3ea785f518d73fe 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
> 065c7e50986872cd35386feee712f3452597d643 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 
> 0c160acf46eb1eb07c5f04091099c1024e166638 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 
> b6f1139fe1a78283277bf4d0c5224ab1d718c634 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> cd75130d7c5f0b402f1b4331c57edc611eb4b2ed 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 
> f31775ed942160da73344c4dca707da7b8c658a6 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 238fbd60572ee5f7f8f6c4d5b2abce8f66c7e495 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 
> d7a56e5846d5754dec5070d8c44443543a3695e4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
> 498b6741c3f40b92ce3fb218e91e7809a17383f0 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> b66817f65f65b6aaf8dbc339a969b8e9e0565e9e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 
> 7b2937032ab8dd57f8923e0a9e7aab4a92de55ee 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 
> be33f380030ea8b416a4549c3947d767bba66356 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 4d2bcfa285dc08811106f3c234346efff22afd99 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> 604c8aee151a45cf942852a3644b5e79f779f353 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 965044d9253585eeaeef50d7fe4fc4d818042df8 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 28a33740b30b7be0057ce91de55a0407dd2f2cbf 
>   ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 
> 056d6141d6239816699ed5f730cbd14e48d8d9bb 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
> bb6ddc6fa4667ac0e30994d0f9ee8b969542383c 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  340689255c738ea497bcd269463b8b8bc38cf34c 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestGenTezWork.java 
> 2c28c398ca49ba661df460c9f3e6d578c785d3ce 
> 
> 
> Diff: https://reviews.apache.org/r/64193/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Sherman
> 
>

Reply via email to