[ https://issues.apache.org/jira/browse/METRON-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016801#comment-16016801 ]
ASF GitHub Bot commented on METRON-965:
---------------------------------------
Github user ottobackwards commented on a diff in the pull request:
https://github.com/apache/metron/pull/596#discussion_r117392109
--- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/hdfs/HdfsWriter.java ---
@@ -85,7 +85,17 @@ public void init(Map stormConfig, TopologyContext topologyContext, WriterConfigu
     this.fileNameFormat.prepare(stormConfig,topologyContext);
     if(syncPolicy != null) {
       //if the user has specified the sync policy, we don't want to override their wishes.
-      syncPolicyCreator = (source,config) -> syncPolicy;
+      syncPolicyCreator = (source,config) -> {
+        try {
--- End diff --
This seems a little black magic-y. Can we expand the comment as to why we
need to clone?
> In the case where we specify the syncpolicy in the HDFS Writer, we do not
> properly clone and end up syncing for every record
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: METRON-965
> URL: https://issues.apache.org/jira/browse/METRON-965
> Project: Metron
> Issue Type: Bug
> Reporter: Casey Stella
>
> Right now the SyncPolicy works as follows:
> * If you specify the sync policy in the flux file, it will use that policy to
> determine when to sync the HDFS Writer.
> * If you do not, then it will create a count sync policy based on the batch
> size.
> In the first case, we do not clone the sync policy properly and end up
> syncing on every record, which hurts performance and is incorrect.
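
To make the "why clone?" question concrete, here is a minimal sketch of the general idea, not the actual Metron or Storm code. It assumes a stateful, count-based policy: `CountPolicy`, `deepCopy`, and `SyncPolicyCloneSketch` are hypothetical names invented for illustration, and serialization is just one possible way to clone. The point is only that a policy which keeps a counter behaves differently when one instance is handed to every source than when each source gets its own copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SyncPolicyCloneSketch {

  /** Hypothetical stand-in for a stateful, count-based sync policy. */
  static class CountPolicy implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int count;
    private int executed = 0;

    CountPolicy(int count) {
      this.count = count;
    }

    /** Records one write and returns true once `count` writes have been seen. */
    boolean mark() {
      executed++;
      return executed >= count;
    }

    /** Clears the counter after a sync. */
    void reset() {
      executed = 0;
    }
  }

  /** Deep-copies an object through Java serialization (an assumed cloning strategy). */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T deepCopy(T original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Unable to clone policy", e);
    }
  }

  public static void main(String[] args) {
    CountPolicy configured = new CountPolicy(3);

    // Each source gets its own copy, so the counters do not interfere.
    CountPolicy forSourceA = deepCopy(configured);
    CountPolicy forSourceB = deepCopy(configured);

    forSourceA.mark();
    forSourceA.mark();
    System.out.println("B due after its first write: " + forSourceB.mark());
    System.out.println("A due after its third write: " + forSourceA.mark());
  }
}
```

If both sources were instead handed the same `configured` instance, writes to one source would advance the counter seen by the other, so the sync cadence would no longer be per source. Whether that is the exact failure behind METRON-965 is not stated here; the sketch only shows why a stateful policy is usually copied rather than shared.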