[
https://issues.apache.org/jira/browse/SAMZA-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097987#comment-14097987
]
Jakob Homan commented on SAMZA-371:
-----------------------------------
I agree the config name is long, but it's also pretty descriptive, and so I
would suggest it shouldn't be changed.
job.systemstreampartition.grouper.factory specifies a per-*job* *factory* that
*groups* *SystemStreamPartitions*. The name is a direct function of what it
does.
The unwieldy part is the systemstreampartition, which I've suggested
simplifying [before|https://issues.apache.org/jira/browse/SAMZA-216], but we
couldn't reach consensus there. And at this point, I agree it's worth the
extra characters to be explicit and clear in that class name, at the cost of
brevity. The alternative would be something like Hadoop's FileSplit, which
isn't always a file, as it could be a database shard, a Kafka topic-partition,
etc. People learn to work around this, but it's still a bit of a confusing
point when they get started.
job.system.stream.partition.grouper.factory doesn't work, since it doesn't
mirror the class name and it conflicts (and is easily confused) with our
convention of system.<the system>.stream.<the stream>.foo.
bq. I agree that one could group by things other than partition, but the thing
you're grouping will always be a partition (with a system/stream).
Sure, but the partition is coequal with the other two elements (system/stream),
at least in the post-SAMZA-123 world. 'Twould be more accurate to say the thing
you're grouping will always contain a partition, rather than be a partition.
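To illustrate the "contains a partition" versus "is a partition" distinction, here is a minimal sketch in Java. It assumes a simplified, hypothetical SystemStreamPartition record and hypothetical helper names (groupByPartition, groupBySystemStreamPartition); none of this is Samza's actual grouper API, and the stream names are made up.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GrouperSketch {
    // Hypothetical stand-in for Samza's SystemStreamPartition:
    // system, stream, and partition are three coequal fields.
    record SystemStreamPartition(String system, String stream, int partition) {}

    // Group on the partition id alone: SSPs from different streams
    // that share a partition id end up in the same group.
    static Map<Integer, Set<SystemStreamPartition>> groupByPartition(
            Collection<SystemStreamPartition> ssps) {
        return ssps.stream().collect(
                Collectors.groupingBy(SystemStreamPartition::partition, Collectors.toSet()));
    }

    // Group on the full (system, stream, partition) triple:
    // every distinct SSP gets its own group.
    static Map<SystemStreamPartition, Set<SystemStreamPartition>> groupBySystemStreamPartition(
            Collection<SystemStreamPartition> ssps) {
        return ssps.stream().collect(
                Collectors.groupingBy(Function.identity(), Collectors.toSet()));
    }

    public static void main(String[] args) {
        // Two made-up streams with two partitions each.
        List<SystemStreamPartition> ssps = List.of(
                new SystemStreamPartition("kafka", "PageViews", 0),
                new SystemStreamPartition("kafka", "PageViews", 1),
                new SystemStreamPartition("kafka", "Clicks", 0),
                new SystemStreamPartition("kafka", "Clicks", 1));
        System.out.println(groupByPartition(ssps).size());
        System.out.println(groupBySystemStreamPartition(ssps).size());
    }
}
```

In both cases the elements being grouped are whole SystemStreamPartitions; only the grouping key differs. With the four inputs above, keying on partition yields two groups, while keying on the full triple yields four.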
bq. We could go with the job.ssp.grouper.factory route. I've been trying to
avoid exposing the "ssp" acronym to end-users since it's totally unclear what
that is, though.
I agree.
I'd prefer to keep the config as it is. It's not one most users will have to
set since we have a default value for it and, when they do need to use
something other than the default, it's a powerful enough config that it's
worth typing the whole thing out (which just needs to be done once).
> Rename job.systemstreampartition.grouper.factory
> ------------------------------------------------
>
> Key: SAMZA-371
> URL: https://issues.apache.org/jira/browse/SAMZA-371
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.8.0
> Reporter: Chris Riccomini
> Fix For: 0.8.0
>
> Attachments: SAMZA-371-0.patch, SAMZA-371-1.patch
>
>
> The config name job.systemstreampartition.grouper.factory, introduced in
> SAMZA-123, is a bit long and unwieldy. I think we should rename it. Some
> proposals:
> # job.input.grouper.factory
> # job.partition.grouper.factory
> # job.task.grouper.factory
> # job.grouper.factory
> (1) is my preference.