Hi Charles,
Unfortunately, non-trivial joins might lead to unexpected results and issues. 
One caveat is that Sqoop will run your expensive query in parallel, once per 
mapper, which might lead to an undesirable performance hit on the database 
side. One way to overcome this issue is to run your expensive non-trivial 
query prior to the Sqoop import and store its output as a table. For example, 
in MySQL you can do

CREATE TABLE sqoop_tmp_table AS SELECT ... JOIN ... JOIN ... JOIN ... JOIN ... 
JOIN ... (query that you've used originally)
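
Once that table exists, the import itself can be a plain table import instead 
of a free-form query import. A minimal sketch, assuming a hypothetical host, 
database, user, numeric key column "id" and target directory (none of these 
come from your setup, so adjust them):

```shell
#!/bin/sh
# Hypothetical connection details: host, database, user, split column and
# target directory are placeholders, not values from this thread.
CONNECT="jdbc:mysql://db.example.com/mydb"
TABLE="sqoop_tmp_table"

# Plain table import of the materialized join, so the expensive query is
# not re-executed by every mapper.
CMD="sqoop import --connect $CONNECT --username sqoop_user -P --table $TABLE --split-by id --num-mappers 4 --target-dir /tmp/sqoop_tmp_table"

# Dry run: only print the command so it can be reviewed before running it.
echo "$CMD"
```

The script only prints the command; paste the printed line into your shell to 
start the import. You would probably also want to drop sqoop_tmp_table after 
the import has finished.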

Jarcec

On Tue, Dec 18, 2012 at 12:26:06PM -0500, Charles Earl wrote:
> Hi,
> Are there any best practices or caveats for including nested joins in 
> free-form query imports?
> I have noted that in the documentation it says "Use of complex queries such 
> as queries that have sub-queries or joins leading to ambiguous projections 
> can lead to unexpected results." I'm relatively new to Sqoop and have not 
> encountered any problems, but I imagine that multiple-mapper imports 
> combined with complex joins might produce inconsistent results, as it seems 
> that the parallelism depends on range partitioning based on the splitting 
> column. Or perhaps this is overthinking….
> 
> Charles 
> 
> 
