> On July 30, 2015, 6:59 p.m., Yan Fang wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/JobNameDateTimeBucketer.scala, line 40
> > <https://reviews.apache.org/r/35445/diff/4/?file=1023371#file1023371line40>
> >
> > My overall concern here is that, if more than one task is running, is it
> > possible that all the tasks are writing to one file at the same time?
>
> Eli Reisman wrote:
>     I don't think so. Each registered source should be using its own
>     HdfsWriter in write() calls, even on the same Producer, and the filenames
>     per writer are unique-ified in the writer impl. There are other ways to
>     accomplish that uniqueness though.
>
> Yan Fang wrote:
>     I see. We are using UUID.randomUUID to make sure the writers write
>     to different files. This is fine unless we win the lottery. :)

Right! I was wondering if you wanted something more concise or there's a better uniquing pattern using some combo of other fields available in the Config + systemName? Glad you brought it up

- Eli


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35445/#review93614
-----------------------------------------------------------


On July 28, 2015, 5:25 a.m., Eli Reisman wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35445/
> -----------------------------------------------------------
>
> (Updated July 28, 2015, 5:25 a.m.)
>
>
> Review request for samza.
>
>
> Repository: samza
>
>
> Description
> -------
>
> SAMZA-693: Very basic HDFS Producer service for Samza
>
>
> Diffs
> -----
>
>   build.gradle 0852adc
>   docs/learn/documentation/versioned/hdfs/producer.md PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducerMetrics.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/BinarySequenceFileHdfsWriter.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/Bucketer.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/HdfsWriter.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/JobNameDateTimeBucketer.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/SequenceFileHdfsWriter.scala PRE-CREATION
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/TextSequenceFileHdfsWriter.scala PRE-CREATION
>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job-text.properties PRE-CREATION
>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job.properties PRE-CREATION
>   samza-hdfs/src/test/resources/samza-hdfs-test-job-text.properties PRE-CREATION
>   samza-hdfs/src/test/resources/samza-hdfs-test-job.properties PRE-CREATION
>   samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala PRE-CREATION
>   settings.gradle 19bff97
>
> Diff: https://reviews.apache.org/r/35445/diff/
>
>
> Testing
> -------
>
> Updated: See JIRA SAMZA-693 for details, this latest update (693-4) addresses
> post-review issues and adds more pluggable design, several default writer
> implementations, and more (and more thorough) unit tests.
>
> Passes 'gradle clean test'.
>
>
> Thanks,
>
> Eli Reisman
>
>
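
A footnote to the file-uniqueness discussion at the top of this message. The sketch below is not the code under review; it is only a minimal Scala illustration of the two naming approaches mentioned there: a random UUID suffix versus a name derived from other available fields. The object name, the method names, and the choice of systemName, jobName, taskName and a sequence number as inputs are all assumptions made for illustration.

```scala
import java.util.UUID

// Hypothetical helper for illustration only; not part of the SAMZA-693 patch.
object UniqueFileNames {

  // Approach 1: random suffix. Two writers could collide in principle,
  // but with a random (version 4) UUID that is vanishingly unlikely.
  def randomName(systemName: String, jobName: String): String =
    s"$systemName-$jobName-${UUID.randomUUID()}"

  // Approach 2: deterministic name built from other fields.
  // Assumption: the (systemName, jobName, taskName, sequence) tuple is never
  // reused by two writers; if it is, they would write to the same file.
  def deterministicName(systemName: String,
                        jobName: String,
                        taskName: String,
                        sequence: Long): String =
    f"$systemName-$jobName-$taskName-$sequence%05d"
}
```

The deterministic form is shorter and easier to read in a directory listing, but it moves the uniqueness guarantee onto whoever assigns the task name and sequence number, whereas the UUID form needs no coordination between writers.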