> On July 29, 2015, 7:41 p.m., Navina Ramesh wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/TextSequenceFileHdfsWriter.scala,
> >  line 37
> > <https://reviews.apache.org/r/35445/diff/4/?file=1023373#file1023373line37>
> >
> >     What is the motivation for having 2 sequenceFileHdfsWriters?  Is the 
> > only difference in the data type being written to the fs? 
> >     
> >     The difference is not clear from the class's javadoc or the example 
> > configuration provided. The reason I ask is we already have a config 
> > overload in Samza. I would rather not have the user configure a writer 
> > type, if we can safely default to a single type and only override the 
> > relevant diverging properties. Functionally, I believe they are doing the 
> > same thing.
> 
> Eli Reisman wrote:
>     I have worked on another Apache project where introducing Writable type 
> params all over the place has painted us into some corners over time, so I 
> wanted to try it this way first. I could definitely rework the patch so that 
> the current base class SequenceFileHdfsWriter takes [K, V] type params and 
> handles data types that way if you like?
> 
> Eli Reisman wrote:
>     Oh and thanks once again for the prompt review, you folks are on top of 
> it!

I am liking the type param idea more as I think about it, despite past 
experience, but I am concerned that things like the casting of the outgoing 
message to something not-Writable like Array[Byte] or String might require a 
third param and it might start to get awkward. Also there are some Writable 
types that would not allow us to determine message size for batching purposes 
the way Text and BytesWritable do and could not be asbtracted this way, so we 
might end up with subclasses still (but smaller interfaces to implement) in 
that case.

Any thoughts or guidance is welcome here, I'm happy enough with it at this 
point to make whatever of these changes sounds good to you and the committers.


- Eli


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35445/#review93501
-----------------------------------------------------------


On July 28, 2015, 5:25 a.m., Eli Reisman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35445/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 5:25 a.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-693: Very basic HDFS Producer service for Samza
> 
> 
> Diffs
> -----
> 
>   build.gradle 0852adc 
>   docs/learn/documentation/versioned/hdfs/producer.md PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala 
> PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala 
> PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducerMetrics.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/BinarySequenceFileHdfsWriter.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/Bucketer.scala 
> PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/HdfsWriter.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/JobNameDateTimeBucketer.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/SequenceFileHdfsWriter.scala
>  PRE-CREATION 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/TextSequenceFileHdfsWriter.scala
>  PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job-text.properties 
> PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job.properties 
> PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-job-text.properties 
> PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-job.properties PRE-CREATION 
>   
> samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala
>  PRE-CREATION 
>   settings.gradle 19bff97 
> 
> Diff: https://reviews.apache.org/r/35445/diff/
> 
> 
> Testing
> -------
> 
> Updated: See JIRA SAMZA-693 for details, this latest update (693-4) addresses 
> post-review issues and adds more pluggable design, several default writer 
> implementations, and more (and more thorough) unit tests.
> 
> Passes 'gradle clean test'.
> 
> 
> Thanks,
> 
> Eli Reisman
> 
>

Reply via email to