[
https://issues.apache.org/jira/browse/SAMZA-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127704#comment-14127704
]
Roger Hoover commented on SAMZA-40:
-----------------------------------
Here are some things that come to mind, though I haven't really thought them
through:
- What about a way to specify a DAG for the job? From the developer's point of
view, she mostly cares about the data flow. Maybe there could be a pluggable
naming scheme for the topics in between jobs so that you don't have to name
them explicitly? You'd want a nice way to specify this. YAML? Using job-name:
wikipedia-feed
- wikipedia-parser
- wikipedia-stats
Ideally, that would be enough to wire everything together?
- Support a programmatic, code-level API for building, validating and deploying
jobs? Hopefully, this would make it possible to build higher-level frameworks
on top that could dynamically generate jobs. I don't know if I'd ever want to
do this but if the API is there, you never know what will spring up.
- Support for validation during build and during runtime initialization to
catch errors early.
- Can sensible defaults make the config less verbose?
- What about on/off switches for things like metrics and checkpointing? If you
don't specify otherwise, you get the default metrics package and Kafka
checkpointing.
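
To make the DAG and naming-scheme ideas above a bit more concrete, here is a
rough sketch of what a code-level wiring API might look like. Everything in it
is hypothetical: JobGraph, connect, validate and inputTopics are made-up names,
not existing Samza APIs, and the edges between the three jobs are assumed purely
for illustration. The naming scheme is a plain function from a job name to the
topic that job writes to, so intermediate topics never have to be named by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: none of these names exist in Samza.
final class JobGraph {
    // Pluggable naming scheme: job name -> name of the topic that job writes to.
    private final Function<String, String> topicNamer;
    // Upstream job -> downstream jobs, kept in declaration order.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    JobGraph(Function<String, String> topicNamer) {
        this.topicNamer = topicNamer;
    }

    // Declares that `downstream` consumes the output of `upstream`.
    JobGraph connect(String upstream, String downstream) {
        edges.computeIfAbsent(upstream, k -> new ArrayList<>()).add(downstream);
        return this;
    }

    // Build-time validation: rejects cyclic wiring, since the flow must be a DAG.
    void validate() {
        Set<String> done = new HashSet<>();
        for (String job : edges.keySet()) {
            visit(job, new HashSet<>(), done);
        }
    }

    // Depth-first walk; `path` holds the jobs on the current chain of edges.
    private void visit(String job, Set<String> path, Set<String> done) {
        if (done.contains(job)) {
            return;
        }
        if (!path.add(job)) {
            throw new IllegalStateException("cycle through job: " + job);
        }
        for (String next : edges.getOrDefault(job, Collections.emptyList())) {
            visit(next, path, done);
        }
        path.remove(job);
        done.add(job);
    }

    // Derives, for every downstream job, the topics it should read from.
    Map<String, List<String>> inputTopics() {
        Map<String, List<String>> inputs = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            String topic = topicNamer.apply(e.getKey());
            for (String downstream : e.getValue()) {
                inputs.computeIfAbsent(downstream, k -> new ArrayList<>()).add(topic);
            }
        }
        return inputs;
    }
}
```

With a namer such as job -> job + "-out", connecting wikipedia-feed to
wikipedia-parser and wikipedia-parser to wikipedia-stats would be enough to
derive each job's input topics. A YAML front end could populate the same
structure.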
> Refactor Samza configuration
> ----------------------------
>
> Key: SAMZA-40
> URL: https://issues.apache.org/jira/browse/SAMZA-40
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Labels: project
>
> Samza's configuration system has several problems that we need to resolve.
> * Want to auto-generate documentation based off of configuration.
> * Should support global defaults for a config property. Right now, we do
> config.getFoo.getOrElse() everywhere.
> * Should validate config up front, rather than throw runtime exceptions
> randomly throughout the code.
> * We are mixing wiring and configuration together. How do other systems
> handle this?
> * We have fragmented configuration (anybody can define configuration). How do
> other systems handle this?
> * How to handle undefined configuration? How to make this interoperable with
> both Java and Scala (i.e. should we support Option in Scala)?
> * Should remain immutable.
> * Should remove implicits. It's just confusing.
> * Do we want to support complex types (list, map) for values, not just String?
> We need a design proposal for this.
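
As one possible illustration of the "global defaults", "validate up front" and
"immutable" points in the description, here is a minimal sketch of a declared
config schema. ConfigSchema and its methods are hypothetical and not part of
Samza; the key names in the usage note are only examples. The assumption is
that every key is declared once, either as required or with a default, so that
callers no longer need getOrElse at each call site.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; ConfigSchema is not a Samza API.
final class ConfigSchema {
    // Keys that have a global default, declared in one place.
    private final Map<String, String> defaults = new HashMap<>();
    // Keys the user must always supply.
    private final List<String> requiredKeys = new ArrayList<>();

    ConfigSchema optional(String key, String defaultValue) {
        defaults.put(key, defaultValue);
        return this;
    }

    ConfigSchema required(String key) {
        requiredKeys.add(key);
        return this;
    }

    // Checks the whole config once, up front, and returns an immutable view
    // with defaults filled in. All problems are reported in a single exception.
    Map<String, String> resolve(Map<String, String> userConfig) {
        List<String> errors = new ArrayList<>();
        for (String key : requiredKeys) {
            if (!userConfig.containsKey(key)) {
                errors.add("missing required key: " + key);
            }
        }
        for (String key : userConfig.keySet()) {
            if (!requiredKeys.contains(key) && !defaults.containsKey(key)) {
                errors.add("unknown key: " + key);
            }
        }
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException(String.join("; ", errors));
        }
        Map<String, String> resolved = new HashMap<>(defaults);
        resolved.putAll(userConfig); // user-supplied values override defaults
        return Collections.unmodifiableMap(resolved);
    }
}
```

For example, a schema that requires job.name and declares metrics.enabled and
checkpoint.enabled with a default of "true" would give the on/off switch
behaviour suggested in the comment above: a user who sets nothing gets both
features, and setting checkpoint.enabled to "false" turns one of them off.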
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)