samza git commit: SAMZA-1623: include avro as the file suffix for hdfs producer

xinyu Wed, 21 Mar 2018 13:13:40 -0700

Repository: samza
Updated Branches:
  refs/heads/master 30c6a89b3 -> 2d7b0f52c



SAMZA-1623: include avro as the file suffix for hdfs producer

AvroDataFileHdfsWriter should include avro as the file suffix as some pig jobs 
couldn't read the avro files if they don't come with the proper suffix

Author: Hai Lu <[email protected]>

Reviewers: Xinyu Liu <[email protected]>

Closes #452 from lhaiesp/master


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/2d7b0f52
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/2d7b0f52
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/2d7b0f52

Branch: refs/heads/master
Commit: 2d7b0f52c316f0007c1217dc29f447b2136eefe3
Parents: 30c6a89
Author: Hai Lu <[email protected]>
Authored: Wed Mar 21 13:13:16 2018 -0700
Committer: xiliu <[email protected]>
Committed: Wed Mar 21 13:13:16 2018 -0700

----------------------------------------------------------------------
 .../apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/2d7b0f52/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
----------------------------------------------------------------------
diff --git 
a/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
 
b/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
index f1868cd..a9823f5 100644
--- 
a/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
+++ 
b/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
@@ -66,7 +66,7 @@ class AvroDataFileHdfsWriter (dfs: FileSystem, systemName: 
String, config: HdfsC
   }
 
   protected def getNextWriter(record: Object): Option[DataFileWriter[Object]] 
= {
-    val path = bucketer.get.getNextWritePath(dfs)
+    val path = bucketer.get.getNextWritePath(dfs).suffix(".avro")
     val isGenericRecord = record.isInstanceOf[GenericRecord]
     val schema = record match {
       case genericRecord: GenericRecord => genericRecord.getSchema

samza git commit: SAMZA-1623: include avro as the file suffix for hdfs producer

Reply via email to