[ 
https://issues.apache.org/jira/browse/HBASE-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530593#comment-14530593
 ] 

Stephen Yuan Jiang commented on HBASE-13625:
--------------------------------------------

[~adityakishore], if you are referring to the unit test failures, the issue was 
that some specific UTs tried to access the local filesystem while the code 
expected DFS.  The V2 patch fixes that.  If you mean that in general we should 
create the parent directory, the code already does that when the parent does 
not exist.
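The "create the parent if it does not exist" behavior mentioned above is the usual check-then-mkdirs pattern; a minimal standalone sketch of it (using java.nio.file instead of Hadoop's FileSystem API so it runs without a cluster — class and method names here are illustrative, not from the patch):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ParentDirDemo {
    // Ensure the parent directory of a target file exists before writing.
    // Hadoop's FileSystem does the equivalent with fs.mkdirs(parent).
    static Path ensureParent(Path file) throws IOException {
        Path parent = file.getParent();
        if (parent != null && !Files.exists(parent)) {
            Files.createDirectories(parent); // creates missing ancestors too
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("parentdemo");
        Path target = tmp.resolve("a").resolve("b").resolve("partitions_file");
        ensureParent(target);
        Files.write(target, "data".getBytes());
        System.out.println(Files.exists(target)); // prints "true"
    }
}
```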



> Use HDFS for HFileOutputFormat2 partitioner's path
> --------------------------------------------------
>
>                 Key: HBASE-13625
>                 URL: https://issues.apache.org/jira/browse/HBASE-13625
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 2.0.0, 1.1.0, 1.2.0
>            Reporter: Stephen Yuan Jiang
>            Assignee: Stephen Yuan Jiang
>         Attachments: HBASE-13625-v2.patch, HBASE-13625.patch
>
>
> HBASE-13010 changed hard-coded '/tmp' in HFileOutputFormat2 partitioner's 
> path to 'hadoop.tmp.dir'.  This breaks unit test in Windows.
> {code}
>    static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
>      ...
>      // create the partitions file
> -    FileSystem fs = FileSystem.get(job.getConfiguration());
> -    Path partitionsPath = new Path("/tmp", "partitions_" + UUID.randomUUID());
> +    FileSystem fs = FileSystem.get(conf);
> +    Path partitionsPath = new Path(conf.get("hadoop.tmp.dir"), "partitions_" + UUID.randomUUID());
> {code}
> Here is the exception from one of the UTs when running against Windows (from 
> branch-1.1); the ':' is an invalid character in a DFS pathname:
> {code}
> java.lang.IllegalArgumentException: Pathname /C:/hbase-server/target/test-data/d25e2228-8959-43ee-b413-4fa69cdb8032/hadoop_tmp/partitions_fb96c0a0-41e6-4964-a391-738cb761ee3e from C:/hbase-server/target/test-data/d25e2228-8959-43ee-b413-4fa69cdb8032/hadoop_tmp/partitions_fb96c0a0-41e6-4964-a391-738cb761ee3e is not a valid DFS filename.
>       at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
>       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:444)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
>       at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
>       at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1074)
>       at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1374)
>       at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:275)
>       at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:297)
>       at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.writePartitions(HFileOutputFormat2.java:335)
>       at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:593)
>       at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:440)
>       at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:405)
>       at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:539)
>       at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:720)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.hbase.mapreduce.TestImportTsv.doMROnTableTest(TestImportTsv.java:313)
>       at org.apache.hadoop.hbase.mapreduce.TestImportTsv.testBulkOutputWithoutAnExistingTable(TestImportTsv.java:168)
> {code}
> The proposed fix is to use a config property that points to an HDFS directory.
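The shape of the proposed fix can be sketched without a cluster: look up a dedicated config property (the key name and default below are illustrative, not taken from the patch) with a fallback, then append a unique partitions-file name. java.util.Properties stands in for Hadoop's Configuration:

```java
import java.util.Properties;
import java.util.UUID;

public class PartitionsPathDemo {
    // Hypothetical property name and default; the actual patch picks its own.
    static final String STAGING_KEY = "hbase.fs.tmp.dir";
    static final String DEFAULT_STAGING = "/user/hbase/staging";

    // Build a unique partitions-file path under a configured HDFS directory,
    // rather than under hadoop.tmp.dir, which may resolve to a local
    // (non-DFS) path such as C:/... on Windows.
    static String partitionsPath(Properties conf) {
        String dir = conf.getProperty(STAGING_KEY, DEFAULT_STAGING);
        return dir + "/partitions_" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(partitionsPath(conf)); // default staging dir + random name
        conf.setProperty(STAGING_KEY, "/tmp/hbase-staging");
        System.out.println(partitionsPath(conf)); // configured dir + random name
    }
}
```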



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
