> On Dec. 14, 2014, 9:06 a.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java, lines 70-72
> > <https://reviews.apache.org/r/29025/diff/3/?file=791537#file791537line70>
> >
> > I'm wondering what the motivation for this change is? It seems to me
> > that using T is much better than a generic Object.
> 
> Veena Basavaraj wrote:
>     Is it not obvious? The actual object can be anything other than a
>     String.
> 
> Veena Basavaraj wrote:
>     Maybe it is not obvious. Let me shed some more light. :)
>     
>     In
>     
>         private class SqoopMapDataWriter extends DataWriter {
>     
>     we need to do
>     
>         @Override
>         public void writeRecord(Object obj) {
>           fromIDF.setData(obj);
>           writeContent();
>         }
>     
>     instead of the current
>     
>         @Override
>         public void writeRecord(Object obj) {
>           fromIDF.setData(obj.toString());
>           writeContent();
>         }
> 
> Jarek Cecho wrote:
>     I agree that writeRecord(Object) should be calling setData(obj)
>     rather than doing the string conversion. So if I understand it
>     correctly, this change is then not needed, correct?
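[Editor's note: the disagreement above comes down to what the caller holds. DataWriter.writeRecord receives an Object, so a String-typed setData forces a lossy toString() conversion, while an Object-typed setter passes the record through unchanged. A minimal, illustrative sketch of that difference; the class names here are stand-ins, not the real Sqoop API.]

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for IntermediateDataFormat<T>.
abstract class IDF<T> {
    protected T data;
    public void setData(T d) { data = d; }
    public T getData() { return data; }
}

// String-typed IDF: a writeRecord(Object) caller must degrade to toString().
class StringIDF extends IDF<String> { }

// Object-typed setter: the writer can pass the record through unchanged.
class ObjectIDF extends IDF<Object> { }

public class ErasureDemo {
    public static void main(String[] args) {
        List<Integer> record = Arrays.asList(1, 2, 3);

        StringIDF s = new StringIDF();
        s.setData(record.toString());   // forced conversion, original type lost

        ObjectIDF o = new ObjectIDF();
        o.setData(record);              // original object preserved as-is
        // Reference equality: the IDF holds the very same instance.
        System.out.println(o.getData() == record);
    }
}
```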
The change exists because we need to make this work; how can T be
recognised here? Try the code out yourself: apply the patch and suggest a
better way if you see one.


> On Dec. 14, 2014, 9:06 a.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java, line 34
> > <https://reviews.apache.org/r/29025/diff/3/?file=791543#file791543line34>
> >
> > I'm wondering here. The SqoopWritable can (and probably will) be used
> > in both the "From" and "To" contexts, right? So we should probably load
> > the correct IDF and not assume that it's "To".
> 
> Veena Basavaraj wrote:
>     When it is, we can get there. At this point it is used only as the
>     writable for the loader, and I would rather make that clear than be
>     forward-thinking for reasons that might not even be relevant now.
> 
> Veena Basavaraj wrote:
>     I even thought of renaming it to ToWritable, but I know some people
>     on this project consider renames the only thing I do, so I wanted to
>     spare us that discussion in the RB.
> 
> Jarek Cecho wrote:
>     I believe that currently the same writable is used for both From and
>     To, hence my comment. E.g. extracted rows will get populated on the
>     mapper side with the "From" IDF, right? It seems to me that
>     separating the IDFs for From/To is actually slightly more
>     complicated, as we have to be careful about what each part of the
>     workflow expects. There will be only one Writable between the
>     Mapper and the Reducer/OutputFormat, and I would assume that the one
>     writable will be written with the "From" context and read by the
>     "To" context. It seems to me that we need to put the logic that does
>     the conversion from "From" to "To" somewhere, so that we end up with
>     only one writable with one IDF. Or are we doing that already and
>     I've just missed it? The 5-line comment made me a bit dizzy.
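[Editor's note: the pattern being debated can be sketched as follows: one writable-style wrapper delegates (de)serialization to whichever IDF it encapsulates, and the matching step bridges the "From" data into the "To" IDF before anything is written. All class and method names below are illustrative stand-ins, not the real Sqoop or Hadoop API.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.function.Function;

// Stand-in for an IDF that exposes its payload as CSV text.
class TextIDF {
    private String csv;
    public void setCSVTextData(String s) { csv = s; }
    public String getCSVTextData() { return csv; }
}

// Writable-style wrapper that delegates to the IDF it encapsulates,
// so a single wrapper class can serve both ends of the shuffle.
class IdfWritable {
    private final TextIDF idf;
    IdfWritable(TextIDF idf) { this.idf = idf; }
    public void write(DataOutputStream out) throws IOException {
        out.writeUTF(idf.getCSVTextData());
    }
    public void readFields(DataInputStream in) throws IOException {
        idf.setCSVTextData(in.readUTF());
    }
}

public class SingleWritableDemo {
    public static void main(String[] args) throws IOException {
        // Hypothetical matcher: converts a From-schema row to the To schema.
        Function<String, String> matcher = row -> row.toUpperCase();

        // Mapper side: the From IDF only feeds the matcher; the To IDF is
        // what the writable actually carries.
        TextIDF fromIDF = new TextIDF();
        fromIDF.setCSVTextData("a,b");
        TextIDF toIDF = new TextIDF();
        toIDF.setCSVTextData(matcher.apply(fromIDF.getCSVTextData()));

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new IdfWritable(toIDF).write(new DataOutputStream(buf));

        // Loader side: the same wrapper class reads back into a To IDF.
        TextIDF loaded = new TextIDF();
        new IdfWritable(loaded).readFields(
            new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(loaded.getCSVTextData());
    }
}
```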
If you have not followed the code closely, I suggest looking at the Sqoop
Mapper and the DataWriter there. At first this code left me dizzy too, so
I have added comments in the mapper:

    private void writeContent() {
      try {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Extracted data: " + fromIDF.getCSVTextData());
        }
        // NOTE: The fromIDF and the corresponding fromSchema are used only
        // for the matching process. The output of the mappers is finally
        // written to the toIDF object after the matching process, since
        // the writable encapsulates the toIDF ==> new SqoopWritable(toIDF)
        toIDF.setObjectData(matcher.getMatchingData(fromIDF.getObjectData()));
        // NOTE: We do not use the reducer to do the writing (a.k.a. LOAD
        // in ETL). Hence the mapper sets up the writable.
        context.write(writable, NullWritable.get());
      } catch (Exception e) {
        throw new SqoopException(MRExecutionError.MAPRED_EXEC_0013, e);
      }
    }

The current code does not use this class for anything but the loader/TO
part. So if you propose reusing it for both, please create a ticket for
that.


> On Dec. 14, 2014, 9:06 a.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java, line 322
> > <https://reviews.apache.org/r/29025/diff/3/?file=791536#file791536line322>
> >
> > I'm wondering what the motivation for this change is? I'm assuming
> > that we want to allow comparing to any existing IDF (albeit comparing
> > different IDFs will always throw an exception), so why restrict the
> > usage to only those that use String as their internal object?
> 
> Veena Basavaraj wrote:
>     See below; there is strong typing for a reason.
> 
> Veena Basavaraj wrote:
>     Even better would be to change this to the actual
>     CSVIntermediateDataFormat.
> 
> Veena Basavaraj wrote:
>     Also, why do we need the compareTo at all? Can't the default work?
>         public int compareTo(CSVIntermediateDataFormat o) {
>           if (this == o) {
>             return 0;
>           }
>           if (this.equals(o)) {
>             return 0;
>           }
>           return data.compareTo(o.getCSVTextData());
>         }
> 
> Jarek Cecho wrote:
>     It seems that we are not using the compareTo() method anywhere in
>     the code, so both the current way and
>     compareTo(CSVIntermediateDataFormat o) work for me. The middle way
>     with IntermediateDataFormat<String> seems strange, though.

I suggest removing the dead code; it seems superfluous when there is no
extra logic in it.

- Veena


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29025/#review65049
-----------------------------------------------------------


On Dec. 14, 2014, 10:42 a.m., Veena Basavaraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29025/
> -----------------------------------------------------------
> 
> (Updated Dec. 14, 2014, 10:42 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1882
>     https://issues.apache.org/jira/browse/SQOOP-1882
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> See JIRA.
> 
> It fixes a few things:
> 
> JobManager currently ignores the TO connector IDF.
> Sqoop Mapper IDF cannot assume the type to be String.
> Using a Set for jars rather than a List, so we can avoid duplicates.
> 
> 
> Diffs
> -----
> 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java dbe193d 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java 93698a8 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/TestCSVIntermediateDataFormat.java 83a95ec 
>   core/src/main/java/org/apache/sqoop/driver/JobManager.java 01073d4 
>   core/src/main/java/org/apache/sqoop/driver/JobRequest.java 8c1cc95 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 9b3eb44 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/MRJobConstants.java 67021a8 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java 336ab97 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 7434243 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java d664337 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 256c34d 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMatching.java 67c8525 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java d897125 
> 
> Diff: https://reviews.apache.org/r/29025/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> 
> 
> Thanks,
> 
> Veena Basavaraj
> 
>