> On March 17, 2014, 12:50 a.m., Prashant Kommireddi wrote: > > bin/readme.txt, line 63 > > <https://reviews.apache.org/r/19257/diff/1/?file=520481#file520481line63> > > > > Is this always a temporary position? > > Gabriel Reid wrote: > Yes, it's (currently) always temporary. The HFiles that are written as > part of the import process (but thrown away after they've been added to > HBase) are written to a random directory under /tmp (in HDFS) by default, but > depending on the setup of HDFS and the existence of directory permissions on > it, it may be necessary to specify a different temp path to use (e.g. > something under the user directory). Ideally this parameter wouldn't need to > exist at all, but I think it is needed. > > James also mentioned a potential use case of writing HFiles for a table > that doesn't exist -- I can see this being useful if you've got a separate MR > cluster from your HBase cluster, and want to build up the HFiles quickly > without putting a load on your HBase cluster. That's not a use case that's > currently supported, but could become useful in the future.
Yep I remember having read about the latter in which case the output directory might not be considered temp. Since that's not supported currently, not something to worry about now. > On March 17, 2014, 12:50 a.m., Prashant Kommireddi wrote: > > phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java, > > line 42 > > <https://reviews.apache.org/r/19257/diff/1/?file=520485#file520485line42> > > > > Should this be named *Test, say CsvBulkLoadToolTest ? > > Gabriel Reid wrote: > No, as of PHOENIX-130 all slow-running integration tests are run by > failsafe, under a separate source root. The "*IT.java" naming is used to help > failsafe to distinguish them from fast-running unit tests. Good to know, I wasn't aware. I'm learning a bit about Phoenix :) > On March 17, 2014, 12:50 a.m., Prashant Kommireddi wrote: > > phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java, > > line 76 > > <https://reviews.apache.org/r/19257/diff/1/?file=520485#file520485line76> > > > > You might want to do this in a try-catch and do closing of > > stream/writer in a finally. > > > > Also, creating the tmp path would be better if its platform agnostic. > > See > > http://docs.oracle.com/javase/7/docs/api/java/io/File.html#createTempFile(java.lang.String, > > java.lang.String) > > Gabriel Reid wrote: > This is a file being created on HDFS (not local FS), so couldn't > platform-agnostic separators cause problems here? > > Closing it in a finally sounds like a good plan -- that being said, again > this is on a transient HDFS cluster within the test, so there is no garbage > that is going to be left behind after the test stops no matter what. I'm sorry, you are right this is writing to HDFS. Ignore my comment regarding using File API. Regarding closing streams, I'm less worried about the garbage and more about closing streams cleanly. Would the streams be released/closed at the end of tests regardless? > On March 17, 2014, 12:50 a.m., Prashant Kommireddi wrote: > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java, > > line 92 > > <https://reviews.apache.org/r/19257/diff/1/?file=520490#file520490line92> > > > > Could this be a private method? > > Gabriel Reid wrote: > No, it's used by the unit tests, so it needs to be package private. It > should be annotated with @VisibleForTesting though. Sounds good > On March 17, 2014, 12:50 a.m., Prashant Kommireddi wrote: > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java, > > line 182 > > <https://reviews.apache.org/r/19257/diff/1/?file=520490#file520490line182> > > > > Probably using "java.io.tmpdir" is better. > > Gabriel Reid wrote: > This is a path on HDFS, so java.io.tmpdir will point to something else. > The explanation on the approach for the creating of this tempdir is in a > comment above. Yep, my bad. - Prashant ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19257/#review37328 ----------------------------------------------------------- On March 15, 2014, 9:16 p.m., Gabriel Reid wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19257/ > ----------------------------------------------------------- > > (Updated March 15, 2014, 9:16 p.m.) > > > Review request for phoenix. > > > Repository: phoenix > > > Description > ------- > > Rewrite of the Phoenix MapReduce import, to follow more standard MR tool > development patterns and use standard Phoenix CSV handling functionality. > > > Diffs > ----- > > bin/csv-bulk-loader.py 385ef41 > bin/readme.txt fa23eeb > phoenix-assembly/pom.xml 9a69cab > phoenix-assembly/src/build/all.xml 9c1bc41 > phoenix-assembly/src/build/mapreduce.xml PRE-CREATION > > phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java > PRE-CREATION > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/CSVBulkLoader.java > 0fd74e1 > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/MapReduceJob.java > 9dc8032 > > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/util/ConfigReader.java > 1c94d6f > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkImportUtil.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/ImportPreUpsertKeyValueProcessor.java > PRE-CREATION > phoenix-core/src/main/java/org/apache/phoenix/schema/PDataType.java 34166d3 > phoenix-core/src/main/java/org/apache/phoenix/util/CSVCommonsLoader.java > 3e86477 > phoenix-core/src/main/java/org/apache/phoenix/util/ColumnInfo.java 37238c8 > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvBulkImportUtilTest.java > PRE-CREATION > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolTest.java > PRE-CREATION > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapperTest.java > PRE-CREATION > phoenix-core/src/test/java/org/apache/phoenix/util/ColumnInfoTest.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/19257/diff/ > > > Testing > ------- > > > Thanks, > > Gabriel Reid > >
