[ 
https://issues.apache.org/jira/browse/SAMZA-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097987#comment-14097987
 ] 

Jakob Homan commented on SAMZA-371:
-----------------------------------

I agree the config name is long, but it's also pretty descriptive, and so I 
would suggest it shouldn't be changed.  
job.systemstreampartition.grouper.factory specifies a per-*job* *factory* that 
*groups* *SystemStreamPartitions*. The name is a direct function of what it 
does.  

The unwieldy part is the systemstreampartition, which I've suggested 
simplifying [before|https://issues.apache.org/jira/browse/SAMZA-216], but we 
couldn't reach consensus there.  And at this point, I agree it's worth the 
extra characters to be explicit and clear in that class name, at the cost of 
brevity.  The alternative would be something like Hadoop's FileSplit, which 
isn't always a file, as it could be a database shard, a Kafka topic-partition, 
etc. People learn to work around this, but it's still a confusing point when 
they get started.

job.system.stream.partition.grouper.factory doesn't work, since it doesn't 
mirror the class name and it conflicts with, and is easily confused with, our 
convention of system.<the system>.stream.<the stream>.foo.

bq. I agree that one could group by things other than partition, but the thing 
you're grouping will always be a partition (with a system/stream).
Sure, but the partition is coequal with the other two elements (system/stream), 
at least in the post-SAMZA-123 world. 'Twould be more accurate to say the thing 
you're grouping will always contain a partition, rather than be a partition.  

bq. We could go with the job.ssp.grouper.factory route. I've been trying to 
avoid exposing the "ssp" acronym to end-users since it's totally unclear what 
that is, though.
I agree. 

I'd prefer to keep the config as it is. It's not one most users will have to 
set, since we have a default value for it, and when they do need to use 
something other than the default, it's a powerful enough config that it's 
worth typing the whole thing out (which only needs to be done once).

> Rename job.systemstreampartition.grouper.factory
> ------------------------------------------------
>
>                 Key: SAMZA-371
>                 URL: https://issues.apache.org/jira/browse/SAMZA-371
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>             Fix For: 0.8.0
>
>         Attachments: SAMZA-371-0.patch, SAMZA-371-1.patch
>
>
> The config name job.systemstreampartition.grouper.factory, introduced in 
> SAMZA-123, is a bit long and unwieldy. I think we should rename it. Some 
> proposals:
> # job.input.grouper.factory
> # job.partition.grouper.factory
> # job.task.grouper.factory
> # job.grouper.factory
> (1) is my preference.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
