[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2017-01-27 Thread Joe Skora (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843084#comment-15843084
 ] 

Joe Skora commented on NIFI-2854:
-

[~markap14], is there any reason that 
[StandardRecordReader|https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/StandardRecordReader.java]
 was not deprecated by this change like 
[StandardRecordWriter|https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/StandardRecordWriter.java]?


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well-documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward:
> * No changes may be made to how the repositories are persisted to disk that 
> would break forward/backward compatibility.  Specifically, this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to an older version, if seen after a rollback, 
> will simply be ignored.
> The following also would be true:
> * Apache NiFi 1.0.0 repositories should work just fine when applied to an 
> Apache NiFi 1.1.0 installation.
> * Repositories made/updated in Apache NiFi 1.1.0 onward would not work in 
> older Apache NiFi releases (such as 1.0.0)
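
The provenance rules quoted above (unknown fields are ignored after a rollback, and a removed optional field needs a sensible default) can be illustrated with a small sketch. This is not NiFi's schema code: the class name TolerantRecordCodec, the name/value wire layout, and the string-only fields are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not NiFi's actual schema classes.
final class TolerantRecordCodec {

    // Writes a record as a field count followed by (name, value) pairs.
    static byte[] write(Map<String, String> fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads a record using the reader's own view of the schema:
    // known fields start at their defaults, unknown fields are skipped.
    static Map<String, String> read(byte[] data, Map<String, String> knownFieldsWithDefaults) {
        Map<String, String> result = new LinkedHashMap<>(knownFieldsWithDefaults);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String value = in.readUTF();
                if (result.containsKey(name)) {
                    result.put(name, value);
                }
                // Otherwise the field was written by a newer version; ignore it.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

An older reader that only knows "eventType" and "details" would read a record written with "eventType" and an extra "newField" by keeping the first, defaulting the second, and dropping the third.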



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677583#comment-15677583
 ] 

ASF subversion and git services commented on NIFI-2854:
---

Commit 1be08714731f01347ac1f98e18047fe7d9ab8afd in nifi's branch 
refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=1be0871 ]

NIFI-2854: Refactor repositories and swap files to use schema-based 
serialization so that nifi can be rolled back to a previous version after an 
upgrade.

NIFI-2854: Incorporated PR review feedback

NIFI-2854: Implemented feedback from PR Review

NIFI-2854: Ensure that all resources are closed on 
CompressableRecordReader.close() even if an IOException is thrown when closing 
one of them

This closes #1202
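
For readers unfamiliar with the pattern named in the last commit message (closing every resource even if one close() throws), a minimal sketch follows. It is not the code from the commit; the helper name CloseAll and its varargs signature are assumptions.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper, not the code from commit 1be0871.
final class CloseAll {

    // Attempts to close every resource. The first IOException is rethrown
    // after all resources have been tried; later ones are attached to it
    // as suppressed exceptions.
    static void closeAll(Closeable... resources) throws IOException {
        IOException first = null;
        for (Closeable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

The point of collecting the exception instead of throwing immediately is that a failure in one stream does not leak the others.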






[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677584#comment-15677584
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/1202








[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677460#comment-15677460
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on the issue:

https://github.com/apache/nifi/pull/1202
  
LGTM, merging




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677328#comment-15677328
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88708219
  
--- Diff: nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/serialization/CompressableRecordReader.java ---
@@ -0,0 +1,277 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.serialization;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.zip.GZIPInputStream;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardRecordReader;
+import org.apache.nifi.provenance.toc.TocReader;
+import org.apache.nifi.stream.io.BufferedInputStream;
+import org.apache.nifi.stream.io.ByteCountingInputStream;
+import org.apache.nifi.stream.io.LimitingInputStream;
+import org.apache.nifi.stream.io.StreamUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class CompressableRecordReader implements RecordReader {
+    private static final Logger logger = LoggerFactory.getLogger(StandardRecordReader.class);
+
+    private final ByteCountingInputStream rawInputStream;
+    private final String filename;
+    private final int serializationVersion;
+    private final boolean compressed;
+    private final TocReader tocReader;
+    private final int headerLength;
+    private final int maxAttributeChars;
+
+    private DataInputStream dis;
+    private ByteCountingInputStream byteCountingIn;
+
+    public CompressableRecordReader(final InputStream in, final String filename, final int maxAttributeChars) throws IOException {
+        this(in, filename, null, maxAttributeChars);
+    }
+
+    public CompressableRecordReader(final InputStream in, final String filename, final TocReader tocReader, final int maxAttributeChars) throws IOException {
+        logger.trace("Creating RecordReader for {}", filename);
+
+        rawInputStream = new ByteCountingInputStream(in);
+        this.maxAttributeChars = maxAttributeChars;
+
+        final InputStream limitedStream;
+        if (tocReader == null) {
+            limitedStream = rawInputStream;
+        } else {
+            final long offset1 = tocReader.getBlockOffset(1);
+            if (offset1 < 0) {
+                limitedStream = rawInputStream;
+            } else {
+                limitedStream = new LimitingInputStream(rawInputStream, offset1 - rawInputStream.getBytesConsumed());
+            }
+        }
+
+        final InputStream readableStream;
+        if (filename.endsWith(".gz")) {
+            readableStream = new BufferedInputStream(new GZIPInputStream(limitedStream));
+            compressed = true;
+        } else {
+            readableStream = new BufferedInputStream(limitedStream);
+            compressed = false;
+        }
+
+        byteCountingIn = new ByteCountingInputStream(readableStream);
+        dis = new DataInputStream(byteCountingIn);
+
+        final String repoClassName = dis.readUTF();
+        final int serializationVersion = dis.readInt();
+        headerLength = repoClassName.getBytes(StandardCharsets.UTF_8).length + 2 + 4; // 2 bytes for string length, 4 for integer.
+
+        this.serializationVersion = serializationVersion;
+        this.filename = filename;
+        this.tocReader = tocReader;
+
+        readHeader(dis, serializationVersion);
+  
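
The header-length arithmetic in the constructor quoted above (UTF-8 byte length of the class name, plus 2 bytes for the readUTF length prefix, plus 4 bytes for the int) can be checked with a standalone sketch. HeaderLengthDemo and its method names are hypothetical and are not part of the PR; only the java.io calls are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; mirrors only the header arithmetic from the diff.
final class HeaderLengthDemo {

    // Writes a header in the same order the reader consumes it:
    // writeUTF(class name) followed by writeInt(version).
    static byte[] writeHeader(String repoClassName, int serializationVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(repoClassName);
            out.writeInt(serializationVersion);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Recomputes the header length the way the constructor does.
    static int headerLength(byte[] header) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            String repoClassName = in.readUTF();
            in.readInt();
            // 2 bytes for the string length prefix, 4 for the integer.
            return repoClassName.getBytes(StandardCharsets.UTF_8).length + 2 + 4;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that writeUTF uses Java's modified UTF-8, so the computed length matches the bytes on disk for ordinary class names but could differ for strings containing NUL or supplementary characters.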

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677301#comment-15677301
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88706738
  
--- Diff: nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/AbstractRecordWriter.java ---
@@ -0,0 +1,173 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+import org.apache.nifi.provenance.serialization.RecordWriter;
+import org.apache.nifi.provenance.toc.TocWriter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class AbstractRecordWriter implements RecordWriter {
+    private static final Logger logger = LoggerFactory.getLogger(AbstractRecordWriter.class);
+
+    private final File file;
+    private final TocWriter tocWriter;
+    private final Lock lock = new ReentrantLock();
+
+    private volatile boolean dirty = false;
+    private volatile boolean closed = false;
+
+    private int recordsWritten = 0;
+
+    public AbstractRecordWriter(final File file, final TocWriter writer) throws IOException {
+        logger.trace("Creating Record Writer for {}", file);
+
+        this.file = file;
+        this.tocWriter = writer;
+    }
+
+    @Override
+    public synchronized void close() throws IOException {
+        closed = true;
+
+        logger.trace("Closing Record Writer for {}", file == null ? null : file.getName());
+
+        lock();
--- End diff --

The 'synchronized' and the lock are there to protect two different 
things. The writer exposes the ability to lock it externally so that multiple 
operations (such as writeRecord, flush, etc.) can be called atomically without 
anything else being written. The 'synchronized' protects a few different member 
variables. Essentially, it employs two completely separate synchronization 
barriers in order to improve throughput (there is no need to wait for a writer 
to finish writing many records and flush before returning the number of records 
written via getRecordsWritten()).

I believe the code was clearer before I separated the writers out into 
abstract classes. As they are now, it is a bit confusing, and it is perhaps 
worth simply using the lock and not synchronizing, for simplicity. However, I 
would be very hesitant to refactor something like that now, as this ticket is 
blocking the 1.1.0 release, and I believe it is correct as-is. It is simply an 
abstraction of existing classes into more layers to avoid code repetition.
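
A minimal sketch of the two-barrier arrangement described above. It is not the actual AbstractRecordWriter code: the class name DualLockWriter, its in-memory buffer, and the held-lock check are hypothetical, assumed only for illustration. The external lock lets a caller group several writes atomically, while the object's monitor guards only the counter, so a reader of the counter does not wait for a batch to finish.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not NiFi's AbstractRecordWriter.
class DualLockWriter {
    // External lock: a caller holds it across several operations to make them atomic.
    private final ReentrantLock lock = new ReentrantLock();

    // Guarded by the external lock.
    private final StringBuilder out = new StringBuilder();

    // Guarded by the object's monitor ('synchronized'), not by the external lock.
    private int recordsWritten = 0;

    public void lock() {
        lock.lock();
    }

    public void unlock() {
        lock.unlock();
    }

    // The caller must hold the external lock while writing.
    public void writeRecord(final String record) {
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalStateException("writeRecord requires the external lock");
        }
        out.append(record).append('\n');
        synchronized (this) {
            recordsWritten++;
        }
    }

    // Needs only the monitor, so it does not block behind a thread
    // that is holding the external lock for a long batch of writes.
    public synchronized int getRecordsWritten() {
        return recordsWritten;
    }
}
```

A caller would wrap a batch as lock(); try { writeRecord(...); writeRecord(...); } finally { unlock(); }, while any other thread can call getRecordsWritten() in the meantime.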



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674384#comment-15674384
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user jtstorck commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88509784
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/AbstractRecordWriter.java
 ---
@@ -0,0 +1,173 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+import org.apache.nifi.provenance.serialization.RecordWriter;
+import org.apache.nifi.provenance.toc.TocWriter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class AbstractRecordWriter implements RecordWriter {
+    private static final Logger logger = LoggerFactory.getLogger(AbstractRecordWriter.class);
+
+    private final File file;
+    private final TocWriter tocWriter;
+    private final Lock lock = new ReentrantLock();
+
+    private volatile boolean dirty = false;
+    private volatile boolean closed = false;
+
+    private int recordsWritten = 0;
+
+    public AbstractRecordWriter(final File file, final TocWriter writer) throws IOException {
+        logger.trace("Creating Record Writer for {}", file);
+
+        this.file = file;
+        this.tocWriter = writer;
+    }
+
+    @Override
+    public synchronized void close() throws IOException {
+        closed = true;
+
+        logger.trace("Closing Record Writer for {}", file == null ? null : file.getName());
+
+        lock();
--- End diff --

I'm confused about the call to lock() as well...  I thought it might be to 
guard tocWriter and the other methods used within close(), but methods like 
isDirty() are called elsewhere without calling lock(), and there's a getter for 
tocWriter.  What is actually being protected by the lock that's not covered by 
the method being synchronized?



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674135#comment-15674135
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88488867
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/schema/ResourceClaimFieldMap.java
 ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.repository.schema;
+
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.repository.schema.Record;
+import org.apache.nifi.repository.schema.RecordSchema;
+
+public class ResourceClaimFieldMap implements Record {
+    private final ResourceClaim resourceClaim;
+    private final RecordSchema schema;
+
+    public ResourceClaimFieldMap(final ResourceClaim resourceClaim, final RecordSchema schema) {
+        this.resourceClaim = resourceClaim;
--- End diff --

Perhaps add null checks here.
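
One way such checks could look, as a sketch only: the class ClaimFieldMap below is a hypothetical stand-in that uses plain Object fields rather than NiFi's ResourceClaim and RecordSchema types. It relies on the standard java.util.Objects.requireNonNull, which throws a NullPointerException with the given message at construction time instead of failing later on first use.

```java
import java.util.Objects;

// Hypothetical stand-in for a record wrapper with two required collaborators.
class ClaimFieldMap {
    private final Object resourceClaim;
    private final Object schema;

    ClaimFieldMap(final Object resourceClaim, final Object schema) {
        // Fail fast with a descriptive message if either argument is missing.
        this.resourceClaim = Objects.requireNonNull(resourceClaim, "resourceClaim must not be null");
        this.schema = Objects.requireNonNull(schema, "schema must not be null");
    }

    Object getResourceClaim() {
        return resourceClaim;
    }

    Object getSchema() {
        return schema;
    }
}
```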


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to older version, if seen after a rollback, 
> will simply be ignored.
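
The rollback behavior in the provenance bullet above can be sketched as follows. This is an illustration under stated assumptions, not NiFi's schema-based reader: TolerantReader, the plain maps, and the field names in the usage note are all hypothetical. An older reader keeps only the fields it knows, substitutes a default when a known field is absent, and silently drops anything it does not recognize.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader for an "older" version; not a NiFi class.
class TolerantReader {
    // Fields this version understands, mapped to the default used when the field is absent.
    private final Map<String, Object> knownFieldDefaults;

    TolerantReader(final Map<String, Object> knownFieldDefaults) {
        this.knownFieldDefaults = new HashMap<>(knownFieldDefaults);
    }

    // Returns only known fields; unknown fields in 'stored' are ignored.
    Map<String, Object> read(final Map<String, Object> stored) {
        final Map<String, Object> result = new HashMap<>();
        for (final Map.Entry<String, Object> known : knownFieldDefaults.entrySet()) {
            final String name = known.getKey();
            // Use the stored value if present, otherwise fall back to the default.
            result.put(name, stored.containsKey(name) ? stored.get(name) : known.getValue());
        }
        return result;
    }
}
```

For example, a reader that knows "eventType" and "details" and is given a record holding "eventType" plus a field added by a newer version would return the stored event type, the default for "details", and no trace of the newer field.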




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674108#comment-15674108
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88486311
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/AbstractRecordWriter.java
 ---
@@ -0,0 +1,173 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+import org.apache.nifi.provenance.serialization.RecordWriter;
+import org.apache.nifi.provenance.toc.TocWriter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class AbstractRecordWriter implements RecordWriter {
+    private static final Logger logger = LoggerFactory.getLogger(AbstractRecordWriter.class);
+
+    private final File file;
+    private final TocWriter tocWriter;
+    private final Lock lock = new ReentrantLock();
+
+    private volatile boolean dirty = false;
+    private volatile boolean closed = false;
+
+    private int recordsWritten = 0;
+
+    public AbstractRecordWriter(final File file, final TocWriter writer) throws IOException {
+        logger.trace("Creating Record Writer for {}", file);
+
+        this.file = file;
+        this.tocWriter = writer;
+    }
+
+    @Override
+    public synchronized void close() throws IOException {
+        closed = true;
+
+        logger.trace("Closing Record Writer for {}", file == null ? null : file.getName());
+
+        lock();
--- End diff --

I'm wondering what the purpose of the explicit locking is when _synchronized_ 
is already used.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674089#comment-15674089
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88484727
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/schema/EventRecord.java
 ---
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.schema;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+import org.apache.nifi.provenance.ProvenanceEventType;
+import org.apache.nifi.provenance.StandardProvenanceEventRecord;
+import org.apache.nifi.repository.schema.FieldMapRecord;
+import org.apache.nifi.repository.schema.Record;
+import org.apache.nifi.repository.schema.RecordField;
+import org.apache.nifi.repository.schema.RecordSchema;
+
+public class EventRecord implements Record {
--- End diff --

The same visibility comment as for ProvenanceEventSchema



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674079#comment-15674079
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88484020
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/schema/ProvenanceEventSchema.java
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.schema;
+
+import static org.apache.nifi.provenance.schema.EventRecordFields.ALTERNATE_IDENTIFIER;
+import static org.apache.nifi.provenance.schema.EventRecordFields.CHILD_UUIDS;
+import static org.apache.nifi.provenance.schema.EventRecordFields.COMPONENT_ID;
+import static org.apache.nifi.provenance.schema.EventRecordFields.COMPONENT_TYPE;
+import static org.apache.nifi.provenance.schema.EventRecordFields.CURRENT_CONTENT_CLAIM;
+import static org.apache.nifi.provenance.schema.EventRecordFields.EVENT_DETAILS;
+import static org.apache.nifi.provenance.schema.EventRecordFields.EVENT_DURATION;
+import static org.apache.nifi.provenance.schema.EventRecordFields.EVENT_TIME;
+import static org.apache.nifi.provenance.schema.EventRecordFields.EVENT_TYPE;
+import static org.apache.nifi.provenance.schema.EventRecordFields.FLOWFILE_ENTRY_DATE;
+import static org.apache.nifi.provenance.schema.EventRecordFields.FLOWFILE_UUID;
+import static org.apache.nifi.provenance.schema.EventRecordFields.LINEAGE_START_DATE;
+import static org.apache.nifi.provenance.schema.EventRecordFields.PARENT_UUIDS;
+import static org.apache.nifi.provenance.schema.EventRecordFields.PREVIOUS_ATTRIBUTES;
+import static org.apache.nifi.provenance.schema.EventRecordFields.PREVIOUS_CONTENT_CLAIM;
+import static org.apache.nifi.provenance.schema.EventRecordFields.RECORD_IDENTIFIER;
+import static org.apache.nifi.provenance.schema.EventRecordFields.RELATIONSHIP;
+import static org.apache.nifi.provenance.schema.EventRecordFields.SOURCE_QUEUE_IDENTIFIER;
+import static org.apache.nifi.provenance.schema.EventRecordFields.SOURCE_SYSTEM_FLOWFILE_IDENTIFIER;
+import static org.apache.nifi.provenance.schema.EventRecordFields.TRANSIT_URI;
+import static org.apache.nifi.provenance.schema.EventRecordFields.UPDATED_ATTRIBUTES;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.nifi.repository.schema.RecordField;
+import org.apache.nifi.repository.schema.RecordSchema;
+
+public class ProvenanceEventSchema {
--- End diff --

Perhaps change the visibility to package-private, since this package is the 
only place it is used. That would also ensure it is not open for public 
consumption until it is meant to be.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671765#comment-15671765
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/1202
  
@olegz I pushed a commit that I believe addresses the main concerns here. 
I did not apply some of the proposed stylistic changes.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671582#comment-15671582
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88327262
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/util/timebuffer/CountSizeEntityAccess.java
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.util.timebuffer;
+
+public class CountSizeEntityAccess implements EntityAccess<TimedCountSize> {
+    @Override
+    public TimedCountSize aggregate(final TimedCountSize oldValue, final TimedCountSize toAdd) {
+        if (oldValue == null && toAdd == null) {
+            return new TimedCountSize(0L, 0L);
+        } else if (oldValue == null) {
+            return toAdd;
+        } else if (toAdd == null) {
+            return oldValue;
+        }
+
+        return new TimedCountSize(oldValue.getCount() + toAdd.getCount(), oldValue.getSize() + toAdd.getSize());
--- End diff --

You're right, if both are not null. My bad!


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important roll in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to older version, if seen after a rollback, 
> will simply be ignored.
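
The provenance rollback rule above (ignore unknown fields, fall back to a sensible default for a removed optional field) can be illustrated with a minimal sketch. This is not NiFi's schema-based reader; `RollbackReadSketch`, `readKnownFields`, and the map-based record representation are hypothetical names chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class RollbackReadSketch {
    /**
     * Hypothetical helper: 'stored' holds the fields found on disk (possibly written
     * by a newer version), 'knownDefaults' maps every field this older version
     * understands to the default it uses when the field is absent.
     */
    static Map<String, Object> readKnownFields(final Map<String, Object> stored,
                                               final Map<String, Object> knownDefaults) {
        final Map<String, Object> result = new HashMap<>();
        // Iterate only over known fields, so anything unknown in 'stored' is ignored.
        for (final Map.Entry<String, Object> known : knownDefaults.entrySet()) {
            final String name = known.getKey();
            result.put(name, stored.containsKey(name) ? stored.get(name) : known.getValue());
        }
        return result;
    }
}
```

A field present on disk keeps its stored value, a missing one takes the default, and a field the older version has never heard of never reaches the result.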



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671552#comment-15671552
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88325829
  
--- Diff: 
nifi-commons/nifi-write-ahead-log/src/main/java/org/wali/MinimalLockingWriteAheadLog.java
 ---
@@ -512,25 +525,38 @@ public synchronized int checkpoint() throws 
IOException {
 swapLocations = new HashSet<>(externalLocations);
 for (final Partition partition : partitions) {
 try {
-partition.rollover();
+partitionStreams.add(partition.rollover());
 } catch (final Throwable t) {
 partition.blackList();
 numberBlackListedPartitions.getAndIncrement();
 throw t;
 }
 }
-
-// notify global sync with the write lock held. We do this 
because we don't want the repository to get updated
-// while the listener is performing its necessary tasks
-if (syncListener != null) {
-syncListener.onGlobalSync();
-}
 } finally {
 writeLock.unlock();
 }
 
 stopTheWorldNanos = System.nanoTime() - stopTheWorldStart;
 
+// Close all of the Partitions' Output Streams. We do this 
here, instead of in Partition.rollover()
+// because we want to do this outside of the write lock. 
Because calling close() on FileOutputStream can
+// be very expensive, as it has to flush the data to disk, we 
don't want to prevent other Process Sessions
+// from getting committed. Since rollover() transitions the 
partition to write to a new file already, there
+// is no reason that we need to close this FileOutputStream 
before releasing the write lock. Also, if any Exception
+// does get thrown when calling close(), we don't need to 
blacklist the partition, as the stream that was getting
+// closed is not the stream being written to for the partition 
anyway.
+for (final OutputStream partitionStream : partitionStreams) {
+partitionStream.close();
--- End diff --

If close() fails we do want to throw an Exception. But you're right - we 
should catch the Exception first and close the other streams so that there is 
no resource leak. Will update that.
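
One way to do what this comment describes (attempt to close every stream, then rethrow) is sketched below. It is an assumption about the shape of the fix, not the code that was committed to the PR; `CloseAllSketch` and `closeAll` are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

class CloseAllSketch {
    /**
     * Hypothetical helper: closes every stream even if some close() calls fail.
     * The first IOException is rethrown after the loop; later failures are
     * attached to it as suppressed exceptions.
     */
    static void closeAll(final List<? extends Closeable> streams) throws IOException {
        IOException failure = null;
        for (final Closeable stream : streams) {
            try {
                stream.close();
            } catch (final IOException e) {
                if (failure == null) {
                    failure = e;
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

The caller still sees an Exception when any close() fails, but no stream is left open because an earlier one threw.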



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671555#comment-15671555
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88325963
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/SchemaRecordReader.java
 ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+
+public class SchemaRecordReader {
+private final RecordSchema schema;
+
+public SchemaRecordReader(final RecordSchema schema) {
+this.schema = schema;
+}
+
+public static SchemaRecordReader fromSchema(final RecordSchema schema) 
{
+return new SchemaRecordReader(schema);
+}
+
+private static void fillBuffer(final InputStream in, final byte[] 
destination) throws IOException {
+int bytesRead = 0;
+int len;
+while (bytesRead < destination.length) {
+len = in.read(destination, bytesRead, destination.length - 
bytesRead);
+if (len < 0) {
+throw new EOFException();
+}
+
+bytesRead += len;
+}
+}
+
+public Record readRecord(final InputStream in) throws IOException {
+final int sentinelByte = in.read();
+if (sentinelByte < 0) {
+return null;
+}
+
+if (sentinelByte != 1) {
+throw new IOException("Expected to read a Sentinel Byte of '1' 
but got a value of '" + sentinelByte + "' instead");
+}
+
+final List<RecordField> schemaFields = schema.getFields();
+final Map<RecordField, Object> fields = new 
HashMap<>(schemaFields.size());
+
+for (final RecordField field : schema.getFields()) {
+final Object value = readField(in, field);
+fields.put(field, value);
+}
+
+return new FieldMapRecord(fields, schema);
+}
+
+
+private Object readField(final InputStream in, final RecordField 
field) throws IOException {
+switch (field.getRepetition()) {
+case ZERO_OR_MORE: {
+// If repetition is 0+ then that means we have a list and 
need to read how many items are in the list.
+final int iterations = readInt(in);
+if (iterations == 0) {
+return Collections.emptyList();
+}
+
+final List<Object> value = new ArrayList<>(iterations);
+for (int i = 0; i < iterations; i++) {
+value.add(readFieldValue(in, field.getFieldType(), 
field.getFieldName(), field.getSubFields()));
+}
+
+return value;
+}
+case ZERO_OR_ONE: {
+// If repetition is 0 or 1 (optional), then check if next 
byte is a 0, which means field is absent or 1, which means
+// field is present. Otherwise, throw an Exception.
+final int nextByte = in.read();
+if (nextByte == -1) {
+throw new EOFException("Unexpected End-of-File when 
attempting to read Repetition value for field '" + field.getFieldName() + "'");
+}
+if (nextByte == 0) {
+return null;
+}
+if 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671548#comment-15671548
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88325590
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/util/timebuffer/CountSizeEntityAccess.java
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.util.timebuffer;
+
+public class CountSizeEntityAccess implements EntityAccess<TimedCountSize> 
{
+@Override
+public TimedCountSize aggregate(final TimedCountSize oldValue, final 
TimedCountSize toAdd) {
+if (oldValue == null && toAdd == null) {
+return new TimedCountSize(0L, 0L);
+} else if (oldValue == null) {
+return toAdd;
+} else if (toAdd == null) {
+return oldValue;
+}
+
+return new TimedCountSize(oldValue.getCount() + toAdd.getCount(), 
oldValue.getSize() + toAdd.getSize());
--- End diff --

I don't believe those are the same...
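
To make the point concrete, here is a small self-contained sketch of the method under review. `CountSize` is a simplified stand-in for `TimedCountSize` (only the two accessors used in the diff), and `AggregateSketch` is a hypothetical wrapper class; the branch logic is copied from the diff.

```java
class AggregateSketch {
    // Simplified stand-in for TimedCountSize, for illustration only.
    static final class CountSize {
        private final long count;
        private final long size;

        CountSize(final long count, final long size) {
            this.count = count;
            this.size = size;
        }

        long getCount() {
            return count;
        }

        long getSize() {
            return size;
        }
    }

    static CountSize aggregate(final CountSize oldValue, final CountSize toAdd) {
        if (oldValue == null && toAdd == null) {
            return new CountSize(0L, 0L);
        } else if (oldValue == null) {
            return toAdd;
        } else if (toAdd == null) {
            return oldValue;
        }

        // Reached whenever both arguments are non-null, so this line is not dead code.
        return new CountSize(oldValue.getCount() + toAdd.getCount(),
                oldValue.getSize() + toAdd.getSize());
    }
}
```

With two non-null arguments none of the three guards match, and the result is the element-wise sum of counts and sizes.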




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671531#comment-15671531
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88324567
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/NamedValue.java
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+public class NamedValue {
--- End diff --

We do have Tuple, which I would be inclined to use for a private variable, 
but I would prefer not to expose it publicly: I believe NamedValue indicates 
its purpose more explicitly than a Tuple would.




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671525#comment-15671525
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88324289
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/FieldMapRecord.java
 ---
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.HashMap;
+import java.util.Map;
+
+public class FieldMapRecord implements Record {
+private final Map<String, Object> values;
+private final RecordSchema schema;
+
+public FieldMapRecord(final Map<RecordField, Object> values, final 
RecordSchema schema) {
+this.schema = schema;
+this.values = convertFieldToName(values);
+}
+
+private static Map<String, Object> convertFieldToName(final 
Map<RecordField, Object> map) {
+final Map<String, Object> nameMap = new HashMap<>(map.size());
+for (final Map.Entry<RecordField, Object> entry : map.entrySet()) {
+nameMap.put(entry.getKey().getFieldName(), entry.getValue());
+}
+return nameMap;
+}
+
+@Override
+public Object getFieldValue(final RecordField field) {
+return values.get(field.getFieldName());
+}
+
+@Override
+public RecordSchema getSchema() {
+return schema;
+}
+
+@Override
+public Object getFieldValue(final String fieldName) {
+return values.get(fieldName);
+}
+
+@Override
+public String toString() {
+return "FieldMapRecord[" + values + "]";
+}
+
+@Override
+public int hashCode() {
+return 33 + 41 * values.hashCode();
+}
+
+@Override
+public boolean equals(final Object obj) {
--- End diff --

It could be but would require exposing private member variables and the 
logic is trivial enough that I don't think it's necessary.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671520#comment-15671520
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88324004
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/ComplexRecordField.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.List;
+import java.util.Objects;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+public class ComplexRecordField implements RecordField {
+private static final FieldType fieldType = FieldType.COMPLEX;
+
+private final String fieldName;
+private final Repetition repetition;
+private final List<RecordField> subFields;
+
+public ComplexRecordField(final String fieldName, final Repetition 
repetition, final RecordField... subFields) {
+this(fieldName, repetition, 
Stream.of(subFields).collect(Collectors.toList()));
--- End diff --

Yes - it is assumed non-null.
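
For context, `Stream.of` on a null varargs array throws a NullPointerException, so the assumption only matters if a caller explicitly passes a null array. A minimal sketch of making that assumption explicit follows; `VarargsNullSketch` and `toList` are hypothetical names, and `String` stands in for `RecordField` to keep the example self-contained.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class VarargsNullSketch {
    // Hypothetical helper mirroring the varargs-to-List conversion in the constructor.
    static List<String> toList(final String... subFields) {
        // Fail fast with a descriptive message instead of an anonymous NPE from Stream.of.
        Objects.requireNonNull(subFields, "subFields must not be null");
        return Stream.of(subFields).collect(Collectors.toList());
    }
}
```

Calling `toList()` with no arguments yields an empty list, since the compiler passes an empty array rather than null.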



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671340#comment-15671340
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88309841
  
--- Diff: 
nifi-commons/nifi-write-ahead-log/src/main/java/org/wali/MinimalLockingWriteAheadLog.java
 ---
@@ -512,25 +525,38 @@ public synchronized int checkpoint() throws 
IOException {
 swapLocations = new HashSet<>(externalLocations);
 for (final Partition partition : partitions) {
 try {
-partition.rollover();
+partitionStreams.add(partition.rollover());
 } catch (final Throwable t) {
 partition.blackList();
 numberBlackListedPartitions.getAndIncrement();
 throw t;
 }
 }
-
-// notify global sync with the write lock held. We do this 
because we don't want the repository to get updated
-// while the listener is performing its necessary tasks
-if (syncListener != null) {
-syncListener.onGlobalSync();
-}
 } finally {
 writeLock.unlock();
 }
 
 stopTheWorldNanos = System.nanoTime() - stopTheWorldStart;
 
+// Close all of the Partitions' Output Streams. We do this 
here, instead of in Partition.rollover()
+// because we want to do this outside of the write lock. 
Because calling close() on FileOutputStream can
+// be very expensive, as it has to flush the data to disk, we 
don't want to prevent other Process Sessions
+// from getting committed. Since rollover() transitions the 
partition to write to a new file already, there
+// is no reason that we need to close this FileOutputStream 
before releasing the write lock. Also, if any Exception
+// does get thrown when calling close(), we don't need to 
blacklist the partition, as the stream that was getting
+// closed is not the stream being written to for the partition 
anyway.
+for (final OutputStream partitionStream : partitionStreams) {
+partitionStream.close();
--- End diff --

What if ```close()``` fails (i.e., IOException) on one of the streams in 
the loop? Perhaps wrapping in try/catch?



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671322#comment-15671322
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88307973
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/util/timebuffer/CountSizeEntityAccess.java
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.util.timebuffer;
+
+public class CountSizeEntityAccess implements EntityAccess<TimedCountSize> {
+    @Override
+    public TimedCountSize aggregate(final TimedCountSize oldValue, final TimedCountSize toAdd) {
+        if (oldValue == null && toAdd == null) {
+            return new TimedCountSize(0L, 0L);
+        } else if (oldValue == null) {
+            return toAdd;
+        } else if (toAdd == null) {
+            return oldValue;
+        }
+
+        return new TimedCountSize(oldValue.getCount() + toAdd.getCount(), oldValue.getSize() + toAdd.getSize());
--- End diff --

Isn't this dead code?
I think the code below would do the same:
```
if (oldValue == null && toAdd == null) {
    return new TimedCountSize(0L, 0L);
} else if (oldValue == null) {
    return toAdd;
} else {
    return oldValue;
}
```
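
One way to check the claim is a minimal sketch with a hypothetical stand-in for `TimedCountSize` (the class `CountSize` and the wrapper `AggregateSketch` are assumptions; only `getCount()` and `getSize()` come from the diff). Under those assumptions, the final `return` appears to be reached whenever both arguments are non-null, so the two versions would differ in that case:

```java
// Hypothetical stand-in for TimedCountSize; only the two getters are taken from the diff.
final class CountSize {
    private final long count;
    private final long size;

    CountSize(final long count, final long size) {
        this.count = count;
        this.size = size;
    }

    long getCount() {
        return count;
    }

    long getSize() {
        return size;
    }
}

final class AggregateSketch {
    // Same branch structure as the aggregate() method in the diff.
    static CountSize original(final CountSize oldValue, final CountSize toAdd) {
        if (oldValue == null && toAdd == null) {
            return new CountSize(0L, 0L);
        } else if (oldValue == null) {
            return toAdd;
        } else if (toAdd == null) {
            return oldValue;
        }
        // Reached only when both arguments are non-null.
        return new CountSize(oldValue.getCount() + toAdd.getCount(), oldValue.getSize() + toAdd.getSize());
    }

    // Branch structure of the shortened version suggested in the comment.
    static CountSize shortened(final CountSize oldValue, final CountSize toAdd) {
        if (oldValue == null && toAdd == null) {
            return new CountSize(0L, 0L);
        } else if (oldValue == null) {
            return toAdd;
        } else {
            // Also taken when both are non-null, so toAdd is not added in.
            return oldValue;
        }
    }
}
```

Calling both methods with two non-null values (for example counts 1 and 2) gives a summed count from `original` and the unchanged old count from `shortened`.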




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671313#comment-15671313
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88307136
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/BufferedInputStream.java
 ---
@@ -16,19 +16,445 @@
  */
 package org.apache.nifi.stream.io;
 
+import java.io.IOException;
 import java.io.InputStream;
 
 /**
  * This class is a slight modification of the BufferedInputStream in the java.io package. The modification is that this implementation does not provide synchronization on method calls, which means
  * that this class is not suitable for use by multiple threads. However, the absence of these synchronized blocks results in potentially much better performance.
--- End diff --

I wonder if the performance statement above is actually true. 
Synchronization by itself does not cause significant performance concerns (if 
any). It only comes into play when more than one thread is involved. Perhaps 
investigate, then deprecate this class and use java.io.BufferedInputStream?
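
A rough way to start that investigation is a single-threaded timing sketch like the one below. Everything in it is an assumption made for illustration: the class `SyncCostSketch` and its methods are hypothetical and only compare an unsynchronized method with a synchronized one on an uncontended object; they do not exercise either stream class. It is not a rigorous benchmark, and a harness such as JMH would give more trustworthy numbers.

```java
// A rough single-threaded timing sketch; not a rigorous benchmark.
final class SyncCostSketch {
    private long plainTotal;
    private long guardedTotal;

    // Unsynchronized update, standing in for a method without locking.
    long addPlain(final long delta) {
        plainTotal += delta;
        return plainTotal;
    }

    // Synchronized update, standing in for a method that takes the object's monitor.
    synchronized long addGuarded(final long delta) {
        guardedTotal += delta;
        return guardedTotal;
    }

    // Returns { plainTotal, guardedTotal, plainNanos, guardedNanos } for one uncontended run.
    static long[] measure(final int iterations) {
        final SyncCostSketch sketch = new SyncCostSketch();

        final long plainStart = System.nanoTime();
        long plain = 0L;
        for (int i = 0; i < iterations; i++) {
            plain = sketch.addPlain(1L);
        }
        final long plainNanos = System.nanoTime() - plainStart;

        final long guardedStart = System.nanoTime();
        long guarded = 0L;
        for (int i = 0; i < iterations; i++) {
            guarded = sketch.addGuarded(1L);
        }
        final long guardedNanos = System.nanoTime() - guardedStart;

        return new long[] {plain, guarded, plainNanos, guardedNanos};
    }
}
```

Calling `SyncCostSketch.measure(...)` once as a warm-up and then again with a larger iteration count, and comparing the last two array entries, gives a first impression of the uncontended cost on a given JVM.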




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671292#comment-15671292
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88305500
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/SimpleRecordField.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Objects;
+
+public class SimpleRecordField implements RecordField {
--- End diff --

Same as the previous comment about an abstract class. This one is actually 
identical to ComplexRecordField. The only difference is in _toString()_, and 
even there the string representing the class name (i.e., ```return 
"SimpleRecordField[f. . .```) could itself be derived from 
```this.getClass().getSimpleName()```.
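
A minimal sketch of that idea follows. All class names here (`AbstractFieldSketch`, `SimpleFieldSketch`, `ComplexFieldSketch`) and the `toString()` format are hypothetical and are not NiFi's actual API; the only point is that a shared base class can derive the name at runtime instead of hard-coding it in each subclass.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical base class; names and toString() format are illustrative only.
abstract class AbstractFieldSketch {
    private final String fieldName;
    private final List<AbstractFieldSketch> subFields;

    protected AbstractFieldSketch(final String fieldName, final List<AbstractFieldSketch> subFields) {
        this.fieldName = Objects.requireNonNull(fieldName);
        this.subFields = subFields == null
            ? Collections.<AbstractFieldSketch>emptyList()
            : Collections.unmodifiableList(subFields);
    }

    public String getFieldName() {
        return fieldName;
    }

    public List<AbstractFieldSketch> getSubFields() {
        return subFields;
    }

    // Shared by every subclass; the class name is looked up at runtime.
    @Override
    public String toString() {
        return getClass().getSimpleName() + "[fieldName=" + fieldName + "]";
    }
}

final class SimpleFieldSketch extends AbstractFieldSketch {
    SimpleFieldSketch(final String fieldName) {
        super(fieldName, Collections.<AbstractFieldSketch>emptyList());
    }
}

final class ComplexFieldSketch extends AbstractFieldSketch {
    ComplexFieldSketch(final String fieldName, final List<AbstractFieldSketch> subFields) {
        super(fieldName, subFields);
    }
}
```

With this shape the subclasses keep only what actually differs, and `new SimpleFieldSketch("id").toString()` starts with the subclass's own simple name.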




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671278#comment-15671278
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88304118
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/SchemaRecordReader.java
 ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+
+public class SchemaRecordReader {
+    private final RecordSchema schema;
+
+    public SchemaRecordReader(final RecordSchema schema) {
+        this.schema = schema;
+    }
+
+    public static SchemaRecordReader fromSchema(final RecordSchema schema) {
+        return new SchemaRecordReader(schema);
+    }
+
+    private static void fillBuffer(final InputStream in, final byte[] destination) throws IOException {
+        int bytesRead = 0;
+        int len;
+        while (bytesRead < destination.length) {
+            len = in.read(destination, bytesRead, destination.length - bytesRead);
+            if (len < 0) {
+                throw new EOFException();
+            }
+
+            bytesRead += len;
+        }
+    }
+
+    public Record readRecord(final InputStream in) throws IOException {
+        final int sentinelByte = in.read();
+        if (sentinelByte < 0) {
+            return null;
+        }
+
+        if (sentinelByte != 1) {
+            throw new IOException("Expected to read a Sentinel Byte of '1' but got a value of '" + sentinelByte + "' instead");
+        }
+
+        final List<RecordField> schemaFields = schema.getFields();
+        final Map<RecordField, Object> fields = new HashMap<>(schemaFields.size());
+
+        for (final RecordField field : schema.getFields()) {
+            final Object value = readField(in, field);
+            fields.put(field, value);
+        }
+
+        return new FieldMapRecord(fields, schema);
+    }
+
+
+    private Object readField(final InputStream in, final RecordField field) throws IOException {
+        switch (field.getRepetition()) {
+            case ZERO_OR_MORE: {
+                // If repetition is 0+ then that means we have a list and need to read how many items are in the list.
+                final int iterations = readInt(in);
+                if (iterations == 0) {
+                    return Collections.emptyList();
+                }
+
+                final List<Object> value = new ArrayList<>(iterations);
+                for (int i = 0; i < iterations; i++) {
+                    value.add(readFieldValue(in, field.getFieldType(), field.getFieldName(), field.getSubFields()));
+                }
+
+                return value;
+            }
+            case ZERO_OR_ONE: {
+                // If repetition is 0 or 1 (optional), then check if next byte is a 0, which means field is absent or 1, which means
+                // field is present. Otherwise, throw an Exception.
+                final int nextByte = in.read();
+                if (nextByte == -1) {
+                    throw new EOFException("Unexpected End-of-File when attempting to read Repetition value for field '" + field.getFieldName() + "'");
+                }
+                if (nextByte == 0) {
+                    return null;
+                }
+                if (nextByte 
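
The optional-field handling described in the `ZERO_OR_ONE` comment above can be sketched in isolation as follows. This is only an illustration under stated assumptions: the class `OptionalFieldSketch` is hypothetical, the value is encoded with `writeUTF` purely for convenience (the diff is cut off before the actual value encoding is shown), and checked exceptions are wrapped in `UncheckedIOException` to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

final class OptionalFieldSketch {
    // Writes 0 for an absent value, or 1 followed by the value itself.
    static byte[] write(final String value) {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (value == null) {
                out.write(0);
            } else {
                out.write(1);
                out.writeUTF(value); // assumption: the real value encoding is not shown in the diff
            }
        } catch (final IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return bytes.toByteArray();
    }

    // -1 is an unexpected end of stream, 0 means absent, 1 means present, anything else is an error.
    static String read(final byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            final int presence = in.read();
            if (presence == -1) {
                throw new EOFException("Unexpected End-of-File when reading the presence byte");
            }
            if (presence == 0) {
                return null;
            }
            if (presence != 1) {
                throw new IOException("Invalid presence byte: " + presence);
            }
            return in.readUTF();
        } catch (final IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
    }
}
```

Round-tripping a non-null string returns the same string, round-tripping `null` returns `null`, and reading an empty array fails with the wrapped `EOFException`.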

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671263#comment-15671263
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88302396
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/RecordSchema.java
 ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class RecordSchema {
+    private static final String FIELD_NAME = "Field Name";
+    private static final String FIELD_TYPE = "Field Type";
+    private static final String REPETITION = "Repetition";
+    private static final String SUBFIELDS = "SubFields";
+
+    private static final String STRING_TYPE = "String";
+    private static final String INT_TYPE = "Integer";
+    private static final String LONG_TYPE = "Long";
+    private static final String SUBFIELD_TYPE = "SubFieldList";
+
+    private final List<RecordField> fields;
+
+    public RecordSchema(final List<RecordField> fields) {
+        this.fields = fields;
+    }
+
+    public RecordSchema(final RecordField... fields) {
+        this(Arrays.asList(fields));
+    }
+
+    public List<RecordField> getFields() {
+        return fields;
+    }
+
+    public RecordField getField(final String fieldName) {
+        return fields.stream()
+            .filter(field -> field.getFieldName().equals(fieldName))
+            .findFirst()
+            .orElse(null);
+    }
+
+    public void writeTo(final OutputStream out) throws IOException {
+        try {
+            final DataOutputStream dos = (out instanceof DataOutputStream) ? (DataOutputStream) out : new DataOutputStream(out);
+
+            dos.writeInt(fields.size());
+            for (final RecordField field : fields) {
+                writeField(field, dos);
+            }
+        } catch (final IOException ioe) {
+            throw new IOException("Unable to write Record Schema to stream", ioe);
+        }
+    }
+
+    private void writeField(final RecordField field, final DataOutputStream dos) throws IOException {
+        dos.writeInt(4);    // A field is made up of 4 "elements": Field Name, Field Type, Field Repetition, Sub-Fields.
+
+        // For each of the elements, we write a String indicating the Element Name, a String indicating the Element Type, and
+        // finally the Element data itself.
+        dos.writeUTF(FIELD_NAME);
+        dos.writeUTF(STRING_TYPE);
+        dos.writeUTF(field.getFieldName());
+
+        dos.writeUTF(FIELD_TYPE);
+        dos.writeUTF(STRING_TYPE);
+        dos.writeUTF(field.getFieldType().name());
+
+        dos.writeUTF(REPETITION);
+        dos.writeUTF(STRING_TYPE);
+        dos.writeUTF(field.getRepetition().name());
+
+        dos.writeUTF(SUBFIELDS);
+        dos.writeUTF(SUBFIELD_TYPE);
+        final List<RecordField> subFields = field.getSubFields();
+        dos.writeInt(subFields.size()); // SubField is encoded as number of Sub-Fields followed by the fields themselves.
+        for (final RecordField subField : subFields) {
+            writeField(subField, dos);
+        }
+    }
+
+    public static RecordSchema readFrom(final InputStream in) throws IOException {
+        try {
+            final DataInputStream dis = (in instanceof DataInputStream) ? (DataInputStream) in : new DataInputStream(in);
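
The element layout that `writeField()` produces (a count, then for each element its name, its type, and its data) can be sketched as a small round trip. The write side follows the diff; the read side is only a guess, because the body of `readFrom()` is cut off above. The class `FieldEncodingSketch`, its helper names, and the choice to report the sub-field count as a string are all assumptions, and nested sub-fields are deliberately not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldEncodingSketch {
    // Writes one field with no sub-fields, using the element layout from writeField() in the diff.
    static byte[] writeField(final String name, final String type, final String repetition) {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            dos.writeInt(4); // number of elements that follow
            writeStringElement(dos, "Field Name", name);
            writeStringElement(dos, "Field Type", type);
            writeStringElement(dos, "Repetition", repetition);
            dos.writeUTF("SubFields");
            dos.writeUTF("SubFieldList");
            dos.writeInt(0); // this sketch never writes sub-fields
        } catch (final IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return bytes.toByteArray();
    }

    private static void writeStringElement(final DataOutputStream dos, final String elementName, final String value)
            throws IOException {
        dos.writeUTF(elementName);
        dos.writeUTF("String");
        dos.writeUTF(value);
    }

    // Hypothetical reader: maps each element name to its value (sub-fields are reported as a count).
    static Map<String, String> readField(final byte[] data) {
        final Map<String, String> elements = new LinkedHashMap<>();
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            final int elementCount = dis.readInt();
            for (int i = 0; i < elementCount; i++) {
                final String elementName = dis.readUTF();
                final String elementType = dis.readUTF();
                if ("String".equals(elementType)) {
                    elements.put(elementName, dis.readUTF());
                } else if ("SubFieldList".equals(elementType)) {
                    final int subFieldCount = dis.readInt();
                    if (subFieldCount != 0) {
                        throw new IOException("This sketch does not read nested sub-fields");
                    }
                    elements.put(elementName, "0");
                } else {
                    throw new IOException("Unknown element type: " + elementType);
                }
            }
        } catch (final IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return elements;
    }
}
```

Because every element carries its own name and type, a reader written this way can at least recognise an element it does not understand, which presumably is what makes the format easier to evolve across versions.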
 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671164#comment-15671164
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88294269
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/NamedValue.java
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+public class NamedValue {
--- End diff --

Do we actually need this? We already have _Tuple_ 




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671157#comment-15671157
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88293855
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/MapRecordField.java
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import static java.util.Objects.requireNonNull;
+
+import java.util.ArrayList;
+import java.util.List;
+
+public class MapRecordField implements RecordField {
--- End diff --

Same comment as above: consider an abstract class.




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15671125#comment-15671125
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user olegz commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r88291413
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/ComplexRecordField.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.List;
+import java.util.Objects;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+public class ComplexRecordField implements RecordField {
+    private static final FieldType fieldType = FieldType.COMPLEX;
+
+    private final String fieldName;
+    private final Repetition repetition;
+    private final List<RecordField> subFields;
+
+    public ComplexRecordField(final String fieldName, final Repetition repetition, final RecordField... subFields) {
+        this(fieldName, repetition, Stream.of(subFields).collect(Collectors.toList()));
--- End diff --

NPE if null is explicitly passed as the last argument.
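
To make the failure mode concrete, here is a minimal sketch (not code from the pull request) of how a varargs parameter can receive an explicit null array and why streaming it then throws. The class VarargsNullDemo and the methods unsafe and safe are hypothetical names, String stands in for RecordField, and the null guard is only one possible remedy, not necessarily the one adopted in the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class VarargsNullDemo {
    // Mirrors the pattern in the diff: the varargs array is streamed directly,
    // so a null array reaches Stream.of and triggers a NullPointerException.
    static List<String> unsafe(final String... subFields) {
        return Stream.of(subFields).collect(Collectors.toList());
    }

    // Hypothetical guard: treat a null array as "no sub-fields".
    static List<String> safe(final String... subFields) {
        if (subFields == null) {
            return Collections.emptyList();
        }
        return Arrays.stream(subFields).collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        System.out.println(safe("a", "b"));
        System.out.println(safe((String[]) null));
        try {
            unsafe((String[]) null);
        } catch (final NullPointerException e) {
            System.out.println("unsafe(null) threw NullPointerException");
        }
    }
}
```

The cast to String[] is what makes the call pass a null array rather than an array containing a single null element.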



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664338#comment-15664338
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/1202
  
@JPercivall I have pushed a new commit that I believe should address your 
feedback. Thanks!




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664295#comment-15664295
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user joshelser commented on the issue:

https://github.com/apache/nifi/pull/1202
  
Thanks for the thoughtful explanation, @markap14! It's very apparent that 
you have put a lot of thought into this one. Sorry for doubting :)




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664217#comment-15664217
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87824046
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/swap/TestFlowFile.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.apache.nifi.flowfile.FlowFile;
+
+public class TestFlowFile implements FlowFileRecord {
--- End diff --

On second thought, after looking at the scope of the class, MockFlowFile 
does work better.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664174#comment-15664174
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87820804
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/swap/SimpleSwapDeserializer.java
 ---
@@ -0,0 +1,303 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.queue.QueueSize;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.IncompleteSwapFileException;
+import org.apache.nifi.controller.repository.StandardFlowFileRecord;
+import org.apache.nifi.controller.repository.SwapContents;
+import org.apache.nifi.controller.repository.SwapSummary;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class SimpleSwapDeserializer implements SwapDeserializer {
+    public static final int SWAP_ENCODING_VERSION = 10;
+    private static final Logger logger = LoggerFactory.getLogger(SimpleSwapDeserializer.class);
+
+    @Override
+    public SwapSummary getSwapSummary(final DataInputStream in, final String swapLocation, final ResourceClaimManager claimManager) throws IOException {
+        final int swapEncodingVersion = in.readInt();
+        if (swapEncodingVersion > SWAP_ENCODING_VERSION) {
--- End diff --

Ah, OK. I misunderstood things and thought 10 was the new format.
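
For readers following along, the check in the diff is an instance of a common pattern: the reader compares the version stored at the start of the file against the highest version it understands and refuses anything newer. A minimal sketch of that pattern, assuming the header begins with a single int; EncodingVersionCheck, MAX_SUPPORTED_VERSION, readVersion and headerWithVersion are hypothetical names, not NiFi API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class EncodingVersionCheck {
    // Highest encoding version this (hypothetical) reader understands.
    static final int MAX_SUPPORTED_VERSION = 10;

    // Reads the leading version int and rejects versions newer than supported.
    static int readVersion(final DataInputStream in) throws IOException {
        final int version = in.readInt();
        if (version > MAX_SUPPORTED_VERSION) {
            throw new IOException("Cannot read encoding version " + version
                + "; highest supported version is " + MAX_SUPPORTED_VERSION);
        }
        return version;
    }

    // Test helper: builds an in-memory stream whose first four bytes are the version.
    static DataInputStream headerWithVersion(final int version) throws IOException {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
        }
        return new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    }

    public static void main(final String[] args) throws IOException {
        System.out.println("accepted version " + readVersion(headerWithVersion(10)));
        try {
            readVersion(headerWithVersion(11));
        } catch (final IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```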



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664030#comment-15664030
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87807209
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/swap/TestSimpleSwapSerializerDeserializer.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import static org.junit.Assert.assertEquals;
+
+import java.io.DataInputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.SwapContents;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardResourceClaimManager;
+import org.apache.nifi.stream.io.NullOutputStream;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+public class TestSimpleSwapSerializerDeserializer {
+    @Before
+    public void setup() {
+        TestFlowFile.resetIdGenerator();
+    }
+
+    @Test
+    public void testRoundTripSerializeDeserialize() throws IOException {
+        final ResourceClaimManager resourceClaimManager = new StandardResourceClaimManager();
+
+        final List<FlowFileRecord> toSwap = new ArrayList<>(1);
+        final Map<String, String> attrs = new HashMap<>();
+        for (int i = 0; i < 1; i++) {
+            attrs.put("i", String.valueOf(i));
+            final FlowFileRecord ff = new TestFlowFile(attrs, i, resourceClaimManager);
+            toSwap.add(ff);
+        }
+
+        final FlowFileQueue flowFileQueue = Mockito.mock(FlowFileQueue.class);
+        Mockito.when(flowFileQueue.getIdentifier()).thenReturn("87bb99fe-412c-49f6-a441-d1b0af4e20b4");
+
+        final String swapLocation = "target/testRoundTrip-" + UUID.randomUUID().toString() + ".swap";
+        final File swapFile = new File(swapLocation);
+
+        Files.deleteIfExists(swapFile.toPath());
+        try {
+            final SimpleSwapSerializer serializer = new SimpleSwapSerializer();
+            try (final FileOutputStream fos = new FileOutputStream(swapFile)) {
+                serializer.serializeFlowFiles(toSwap, flowFileQueue, swapLocation, fos);
+            }
+
+            final SimpleSwapDeserializer deserializer = new SimpleSwapDeserializer();
+            final SwapContents swappedIn;
+            try (final FileInputStream fis = new FileInputStream(swapFile);
+                    final DataInputStream dis = new DataInputStream(fis)) {
+                swappedIn = deserializer.deserializeFlowFiles(dis, swapLocation, flowFileQueue, resourceClaimManager);
+            }
+
+            assertEquals(toSwap.size(), swappedIn.getFlowFiles().size());
+            for (int i = 0; i < toSwap.size(); i++) {
+                final FlowFileRecord pre = toSwap.get(i);
+                final FlowFileRecord post = swappedIn.getFlowFiles().get(i);
+
+                assertEquals(pre.getSize(), post.getSize());
+                assertEquals(pre.getAttributes(), post.getAttributes());
+                assertEquals(pre.getSize(), post.getSize());
+                assertEquals(pre.getId(), post.getId());
+ 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664036#comment-15664036
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87807516
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/schema/FieldSerializer.java
 ---
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.schema;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+
+public interface FieldSerializer {
--- End diff --

Yes - good catch.




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664029#comment-15664029
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user devriesb commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87807081
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/BufferedInputStream.java
 ---
@@ -16,19 +16,445 @@
  */
 package org.apache.nifi.stream.io;
 
+import java.io.IOException;
 import java.io.InputStream;
 
 /**
  * This class is a slight modification of the BufferedInputStream in the java.io package. The modification is that this implementation does not provide synchronization on method calls, which means
  * that this class is not suitable for use by multiple threads. However, the absence of these synchronized blocks results in potentially much better performance.
  */
-public class BufferedInputStream extends java.io.BufferedInputStream {
+public class BufferedInputStream extends InputStream {
--- End diff --

I agree with Joe... we've had instances where Eclipse (and I assume IntelliJ 
/ other IDEs) suggests the NiFi version of BufferedInputStream, resulting in 
bad, unexpected behavior.  The name is really unfortunate.  Perhaps move the 
functionality to "UnsynchronizedBufferedInputStream", modify NiFi's 
BufferedInputStream to be an empty extension, and deprecate it.  Then complete 
the rename / removal in a future release (like a major version bump, if the 
breaking change is the concern...).
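
The migration path suggested here could look roughly like the sketch below. It is an illustration only: the class bodies are placeholders that delegate without any real buffering, UnsynchronizedBufferedInputStream is simply the name proposed in this comment, and LegacyBufferedInputStream is a hypothetical stand-in for NiFi's BufferedInputStream so that the example compiles without clashing with java.io.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// New home for the functionality under an unambiguous name.
// Placeholder body: delegates each call and does no buffering.
class UnsynchronizedBufferedInputStream extends InputStream {
    private final InputStream in;

    UnsynchronizedBufferedInputStream(final InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

/**
 * Old name kept as an empty extension so existing callers still compile.
 *
 * @deprecated use {@link UnsynchronizedBufferedInputStream} instead
 */
@Deprecated
class LegacyBufferedInputStream extends UnsynchronizedBufferedInputStream {
    LegacyBufferedInputStream(final InputStream in) {
        super(in);
    }
}

class DeprecationDemo {
    public static void main(final String[] args) throws IOException {
        // Existing code keeps working but now gets a deprecation warning at compile time.
        try (InputStream in = new LegacyBufferedInputStream(new ByteArrayInputStream(new byte[] {42}))) {
            System.out.println(in.read());
        }
    }
}
```

With this arrangement the deprecated class can be removed in a later major release without any behavioral change for callers that have already switched to the new name.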




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664017#comment-15664017
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87806443
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/swap/SimpleSwapSerializer.java
 ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class SimpleSwapSerializer implements SwapSerializer {
--- End diff --

True. Will do so.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664011#comment-15664011
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87806305
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadRepositoryRecordSerde.java
 ---
@@ -0,0 +1,517 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.repository;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.apache.nifi.flowfile.FlowFile;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.wali.SerDe;
+import org.wali.UpdateType;
+
+public class WriteAheadRepositoryRecordSerde extends RepositoryRecordSerde 
implements SerDe<RepositoryRecord> {
+private static final Logger logger = 
LoggerFactory.getLogger(WriteAheadRepositoryRecordSerde.class);
+
+private static final int CURRENT_ENCODING_VERSION = 9;
+
+public static final byte ACTION_CREATE = 0;
+public static final byte ACTION_UPDATE = 1;
+public static final byte ACTION_DELETE = 2;
+public static final byte ACTION_SWAPPED_OUT = 3;
+public static final byte ACTION_SWAPPED_IN = 4;
+
+private long recordsRestored = 0L;
+private final ResourceClaimManager claimManager;
+
+public WriteAheadRepositoryRecordSerde(final ResourceClaimManager 
claimManager) {
+this.claimManager = claimManager;
+}
+
+@Override
+public void serializeEdit(final RepositoryRecord previousRecordState, 
final RepositoryRecord record, final DataOutputStream out) throws IOException {
+serializeEdit(previousRecordState, record, out, false);
+}
+
+public void serializeEdit(final RepositoryRecord previousRecordState, 
final RepositoryRecord record, final DataOutputStream out, final boolean 
forceAttributesWritten) throws IOException {
+if (record.isMarkedForAbort()) {
+logger.warn("Repository Record {} is marked to be aborted; it 
will be persisted in the FlowFileRepository as a DELETE record", record);
+out.write(ACTION_DELETE);
+out.writeLong(getRecordIdentifier(record));
+serializeContentClaim(record.getCurrentClaim(), 
record.getCurrentClaimOffset(), out);
+return;
+}
+
+final UpdateType updateType = getUpdateType(record);
+
+if (updateType.equals(UpdateType.DELETE)) {
+out.write(ACTION_DELETE);
+out.writeLong(getRecordIdentifier(record));
+serializeContentClaim(record.getCurrentClaim(), 
record.getCurrentClaimOffset(), out);
+return;
+}
+
+// If there's a Destination Connection, that's the one that we 
want to associated with this record.
+// However, on restart, we will restore the FlowFile and set this 
connection to its "originalConnection".
+// If we then serialize the FlowFile again before it's 
transferred, it's important to allow this to happen,
+// so we use the originalConnection instead
+FlowFileQueue associatedQueue = record.getDestination();
+if (associatedQueue == null) {
+ 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663987#comment-15663987
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87804332
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FileSystemSwapManager.java
 ---
@@ -251,353 +262,36 @@ public SwapSummary getSwapSummary(final String 
swapLocation) throws IOException
 final InputStream bufferedIn = new 
BufferedInputStream(fis);
 final DataInputStream in = new 
DataInputStream(bufferedIn)) {
 
-final int swapEncodingVersion = in.readInt();
-if (swapEncodingVersion > SWAP_ENCODING_VERSION) {
-final String errMsg = "Cannot swap FlowFiles in from " + 
swapFile + " because the encoding version is "
-+ swapEncodingVersion + ", which is too new 
(expecting " + SWAP_ENCODING_VERSION + " or less)";
-
-eventReporter.reportEvent(Severity.ERROR, EVENT_CATEGORY, 
errMsg);
-throw new IOException(errMsg);
-}
-
-final int numRecords;
-final long contentSize;
-Long maxRecordId = null;
-try {
-in.readUTF(); // ignore Connection ID
-numRecords = in.readInt();
-contentSize = in.readLong();
-
-if (numRecords == 0) {
-return StandardSwapSummary.EMPTY_SUMMARY;
-}
-
-if (swapEncodingVersion > 7) {
-maxRecordId = in.readLong();
-}
-} catch (final EOFException eof) {
-logger.warn("Found premature End-of-File when reading Swap 
File {}. EOF occurred before any FlowFiles were encountered", swapLocation);
-return StandardSwapSummary.EMPTY_SUMMARY;
-}
-
-final QueueSize queueSize = new QueueSize(numRecords, 
contentSize);
-final SwapContents swapContents = deserializeFlowFiles(in, 
queueSize, maxRecordId, swapEncodingVersion, true, claimManager, swapLocation);
-return swapContents.getSummary();
-}
-}
-
-public static int serializeFlowFiles(final List<FlowFileRecord> 
toSwap, final FlowFileQueue queue, final String swapLocation, final 
OutputStream destination) throws IOException {
-if (toSwap == null || toSwap.isEmpty()) {
-return 0;
-}
-
-long contentSize = 0L;
-for (final FlowFileRecord record : toSwap) {
-contentSize += record.getSize();
-}
-
-// persist record to disk via the swap file
-final OutputStream bufferedOut = new 
BufferedOutputStream(destination);
-final DataOutputStream out = new DataOutputStream(bufferedOut);
-try {
-out.writeInt(SWAP_ENCODING_VERSION);
-out.writeUTF(queue.getIdentifier());
-out.writeInt(toSwap.size());
-out.writeLong(contentSize);
-
-// get the max record id and write that out so that we know it 
quickly for restoration
-long maxRecordId = 0L;
-for (final FlowFileRecord flowFile : toSwap) {
-if (flowFile.getId() > maxRecordId) {
-maxRecordId = flowFile.getId();
-}
-}
-
-out.writeLong(maxRecordId);
-
-for (final FlowFileRecord flowFile : toSwap) {
-out.writeLong(flowFile.getId());
-out.writeLong(flowFile.getEntryDate());
-out.writeLong(flowFile.getLineageStartDate());
-out.writeLong(flowFile.getLineageStartIndex());
-out.writeLong(flowFile.getLastQueueDate());
-out.writeLong(flowFile.getQueueDateIndex());
-out.writeLong(flowFile.getSize());
-
-final ContentClaim claim = flowFile.getContentClaim();
-if (claim == null) {
-out.writeBoolean(false);
-} else {
-out.writeBoolean(true);
-final ResourceClaim resourceClaim = 
claim.getResourceClaim();
-out.writeUTF(resourceClaim.getId());
-out.writeUTF(resourceClaim.getContainer());
-out.writeUTF(resourceClaim.getSection());
-out.writeLong(claim.getOffset());
-

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663986#comment-15663986
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87804314
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FileSystemSwapManager.java
 ---
@@ -210,30 +215,36 @@ public boolean accept(final File dir, final String 
name) {
 // "--.swap". If we 
have two dashes, then we can just check if the queue ID is equal
 // to the id of the queue given and if not we can just move on.
 final String[] splits = swapFile.getName().split("-");
-if (splits.length == 3) {
-final String queueIdentifier = splits[1];
-if 
(!queueIdentifier.equals(flowFileQueue.getIdentifier())) {
-continue;
+if (splits.length > 6) {
--- End diff --

Yes - this was broken before. It was never noticed because it was only a 
performance tweak, but I noticed it as I was stepping through the code.
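
One plausible reading of why the old `splits.length == 3` check never matched is that the queue identifier is itself a UUID and therefore contains dashes, so splitting the file name on "-" produces many more than three parts. The sketch below assumes a file name of the form timestamp, queue UUID, swap-file UUID, then ".swap"; that layout, the class `SwapFileNames`, and the sample UUIDs are assumptions for illustration, not taken from the NiFi source.

```java
import java.util.Optional;

// Hypothetical helper; the file-name layout is an assumption:
//   <timestamp>-<queue UUID>-<swap file UUID>.swap
final class SwapFileNames {

    // Returns the queue identifier embedded in the file name, if it can be recovered.
    static Optional<String> queueIdentifier(String fileName) {
        String[] splits = fileName.split("-");
        // A UUID has five dash-separated groups, so a name that embeds one
        // after the timestamp always splits into more than six parts.
        if (splits.length > 6) {
            return Optional.of(String.join("-",
                    splits[1], splits[2], splits[3], splits[4], splits[5]));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String name = "1479139200000-8c0d4e7a-0158-1000-9b2f-3f1c2a4d5e6f"
                + "-0f5e7d3c-2b1a-4c9d-8e7f-6a5b4c3d2e1f.swap";
        System.out.println(queueIdentifier(name).orElse("(unknown)"));
    }
}
```

When the identifier cannot be recovered, a caller would have to open the swap file and read the connection ID from its header, which is the slower path the filename check is meant to avoid.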




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658585#comment-15658585
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87675047
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/serialization/CompressableRecordReader.java
 ---
@@ -0,0 +1,281 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.serialization;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.zip.GZIPInputStream;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardRecordReader;
+import org.apache.nifi.provenance.toc.TocReader;
+import org.apache.nifi.stream.io.BufferedInputStream;
+import org.apache.nifi.stream.io.ByteCountingInputStream;
+import org.apache.nifi.stream.io.LimitingInputStream;
+import org.apache.nifi.stream.io.StreamUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class CompressableRecordReader implements RecordReader {
+private static final Logger logger = 
LoggerFactory.getLogger(StandardRecordReader.class);
+
+private final ByteCountingInputStream rawInputStream;
+private final String filename;
+private final int serializationVersion;
+private final boolean compressed;
+private final TocReader tocReader;
+private final int headerLength;
+private final int maxAttributeChars;
+
+private DataInputStream dis;
+private ByteCountingInputStream byteCountingIn;
+
+public CompressableRecordReader(final InputStream in, final String 
filename, final int maxAttributeChars) throws IOException {
+this(in, filename, null, maxAttributeChars);
+}
+
+public CompressableRecordReader(final InputStream in, final String 
filename, final TocReader tocReader, final int maxAttributeChars) throws 
IOException {
+logger.trace("Creating RecordReader for {}", filename);
+
+rawInputStream = new ByteCountingInputStream(in);
+this.maxAttributeChars = maxAttributeChars;
+
+final InputStream limitedStream;
+if (tocReader == null) {
+limitedStream = rawInputStream;
+} else {
+final long offset1 = tocReader.getBlockOffset(1);
+if (offset1 < 0) {
+limitedStream = rawInputStream;
+} else {
+limitedStream = new LimitingInputStream(rawInputStream, 
offset1 - rawInputStream.getBytesConsumed());
+}
+}
+
+final InputStream readableStream;
+if (filename.endsWith(".gz")) {
+readableStream = new BufferedInputStream(new 
GZIPInputStream(limitedStream));
+compressed = true;
+} else {
+readableStream = new BufferedInputStream(limitedStream);
+compressed = false;
+}
+
+byteCountingIn = new ByteCountingInputStream(readableStream);
+dis = new DataInputStream(byteCountingIn);
+
+final String repoClassName = dis.readUTF();
+final int serializationVersion = dis.readInt();
+headerLength = 
repoClassName.getBytes(StandardCharsets.UTF_8).length + 2 + 4; // 2 bytes for 
string length, 4 for integer.
+
+if (serializationVersion < 1 || serializationVersion > 9) {
+throw new IllegalArgumentException("Unable to deserialize 
record because the version is " + serializationVersion + 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658579#comment-15658579
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87674718
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/serialization/CompressableRecordReader.java
 ---
@@ -0,0 +1,281 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.serialization;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.zip.GZIPInputStream;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardProvenanceEventRecord;
+import org.apache.nifi.provenance.StandardRecordReader;
+import org.apache.nifi.provenance.toc.TocReader;
+import org.apache.nifi.stream.io.BufferedInputStream;
+import org.apache.nifi.stream.io.ByteCountingInputStream;
+import org.apache.nifi.stream.io.LimitingInputStream;
+import org.apache.nifi.stream.io.StreamUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class CompressableRecordReader implements RecordReader {
+private static final Logger logger = 
LoggerFactory.getLogger(StandardRecordReader.class);
+
+private final ByteCountingInputStream rawInputStream;
+private final String filename;
+private final int serializationVersion;
+private final boolean compressed;
+private final TocReader tocReader;
+private final int headerLength;
+private final int maxAttributeChars;
+
+private DataInputStream dis;
+private ByteCountingInputStream byteCountingIn;
+
+public CompressableRecordReader(final InputStream in, final String 
filename, final int maxAttributeChars) throws IOException {
+this(in, filename, null, maxAttributeChars);
+}
+
+public CompressableRecordReader(final InputStream in, final String 
filename, final TocReader tocReader, final int maxAttributeChars) throws 
IOException {
+logger.trace("Creating RecordReader for {}", filename);
+
+rawInputStream = new ByteCountingInputStream(in);
+this.maxAttributeChars = maxAttributeChars;
+
+final InputStream limitedStream;
+if (tocReader == null) {
+limitedStream = rawInputStream;
+} else {
+final long offset1 = tocReader.getBlockOffset(1);
+if (offset1 < 0) {
+limitedStream = rawInputStream;
+} else {
+limitedStream = new LimitingInputStream(rawInputStream, 
offset1 - rawInputStream.getBytesConsumed());
+}
+}
+
+final InputStream readableStream;
+if (filename.endsWith(".gz")) {
+readableStream = new BufferedInputStream(new 
GZIPInputStream(limitedStream));
+compressed = true;
+} else {
+readableStream = new BufferedInputStream(limitedStream);
+compressed = false;
+}
+
+byteCountingIn = new ByteCountingInputStream(readableStream);
+dis = new DataInputStream(byteCountingIn);
+
+final String repoClassName = dis.readUTF();
+final int serializationVersion = dis.readInt();
+headerLength = 
repoClassName.getBytes(StandardCharsets.UTF_8).length + 2 + 4; // 2 bytes for 
string length, 4 for integer.
+
+if (serializationVersion < 1 || serializationVersion > 9) {
+throw new IllegalArgumentException("Unable to deserialize 
record because the version is " + serializationVersion + 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658583#comment-15658583
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87668398
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/repository/TestWriteAheadFlowFileRepository.java
 ---
@@ -64,6 +65,7 @@ public static void setupProperties() {
 }
 
 @Before
+@After
--- End diff --

Why does it need to be cleared before and after?




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658584#comment-15658584
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87661939
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/swap/SimpleSwapDeserializer.java
 ---
@@ -0,0 +1,303 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.io.DataInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.queue.QueueSize;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.IncompleteSwapFileException;
+import org.apache.nifi.controller.repository.StandardFlowFileRecord;
+import org.apache.nifi.controller.repository.SwapContents;
+import org.apache.nifi.controller.repository.SwapSummary;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class SimpleSwapDeserializer implements SwapDeserializer {
+    public static final int SWAP_ENCODING_VERSION = 10;
+    private static final Logger logger = LoggerFactory.getLogger(SimpleSwapDeserializer.class);
+
+    @Override
+    public SwapSummary getSwapSummary(final DataInputStream in, final String swapLocation, final ResourceClaimManager claimManager) throws IOException {
+        final int swapEncodingVersion = in.readInt();
+        if (swapEncodingVersion > SWAP_ENCODING_VERSION) {
--- End diff --

Wouldn't the highest encoding version this accepts be 9, given that 10 is 
the latest version, which uses the new format?
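
For illustration only, here is a minimal sketch of the kind of gate the comment is asking about. It is not the code from the pull request: the class and method names are hypothetical, and the only thing taken from the diff is the shape of the check (read an int, reject anything greater than a cap). It shows that a cap of 10 also admits files written with version 10, whereas a cap of 9 rejects them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration; not the code from the pull request.
class SwapVersionGate {

    // Reads the leading int from the stream and rejects versions newer than maxSupported.
    static int readSupportedVersion(final DataInputStream in, final int maxSupported) throws IOException {
        final int version = in.readInt();
        if (version > maxSupported) {
            throw new IOException("Encoding version " + version
                + " is too new (expecting " + maxSupported + " or less)");
        }
        return version;
    }

    // Builds a minimal header that holds only the encoding version.
    static byte[] header(final int version) throws IOException {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
        }
        return bytes.toByteArray();
    }

    // Returns true if a file written with fileVersion passes a gate capped at maxSupported.
    static boolean accepts(final int fileVersion, final int maxSupported) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header(fileVersion)))) {
            readSupportedVersion(in, maxSupported);
            return true;
        } catch (final IOException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        System.out.println(accepts(9, 10));   // true
        System.out.println(accepts(10, 10));  // true: a cap of 10 also admits version 10
        System.out.println(accepts(10, 9));   // false: a cap of 9 rejects version 10
    }
}
```

Whether 9 or 10 is the right cap depends on which deserializer is meant to own version 10, which is exactly the question raised here.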


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>  Reporter: Mark Payne
>  Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new 

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658572#comment-15658572
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87661134
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/swap/SimpleSwapSerializer.java
 ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class SimpleSwapSerializer implements SwapSerializer {
--- End diff --

I believe this can be marked deprecated since it will no longer be used (we 
aren't serializing the old format, only deserializing).
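
As a rough sketch of what that suggestion could look like, assuming stand-in types rather than the real NiFi classes: the old-format writer keeps compiling for existing callers and tests, but carries `@Deprecated` plus a Javadoc `@deprecated` note that points at its replacement. All names below are hypothetical.

```java
// Hypothetical stand-ins; these are not the NiFi classes from the pull request.
class DeprecationDemo {
    // Reports whether a type carries the @Deprecated annotation (retained at runtime).
    static boolean isDeprecated(final Class<?> type) {
        return type.isAnnotationPresent(Deprecated.class);
    }

    public static void main(final String[] args) {
        System.out.println(isDeprecated(LegacyRecordWriter.class)); // true
        System.out.println(isDeprecated(SchemaRecordWriter.class)); // false
    }
}

interface RecordWriter {
    String write(String record);
}

/**
 * Writes the old format.
 *
 * @deprecated kept only so existing callers and tests compile; new data is
 *             written by SchemaRecordWriter.
 */
@Deprecated
class LegacyRecordWriter implements RecordWriter {
    @Override
    public String write(final String record) {
        return "v1:" + record;
    }
}

class SchemaRecordWriter implements RecordWriter {
    @Override
    public String write(final String record) {
        return "v2:" + record;
    }
}
```

Marking the class rather than deleting it leaves round-trip tests for the old format possible while steering new code away from it.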



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658582#comment-15658582
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87672258
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/swap/TestSimpleSwapSerializerDeserializer.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import static org.junit.Assert.assertEquals;
+
+import java.io.DataInputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.SwapContents;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardResourceClaimManager;
+import org.apache.nifi.stream.io.NullOutputStream;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+public class TestSimpleSwapSerializerDeserializer {
+    @Before
+    public void setup() {
+        TestFlowFile.resetIdGenerator();
+    }
+
+    @Test
+    public void testRoundTripSerializeDeserialize() throws IOException {
+        final ResourceClaimManager resourceClaimManager = new StandardResourceClaimManager();
+
+        final List<FlowFileRecord> toSwap = new ArrayList<>(1);
+        final Map<String, String> attrs = new HashMap<>();
+        for (int i = 0; i < 1; i++) {
+            attrs.put("i", String.valueOf(i));
+            final FlowFileRecord ff = new TestFlowFile(attrs, i, resourceClaimManager);
+            toSwap.add(ff);
+        }
+
+        final FlowFileQueue flowFileQueue = Mockito.mock(FlowFileQueue.class);
+        Mockito.when(flowFileQueue.getIdentifier()).thenReturn("87bb99fe-412c-49f6-a441-d1b0af4e20b4");
+
+        final String swapLocation = "target/testRoundTrip-" + UUID.randomUUID().toString() + ".swap";
+        final File swapFile = new File(swapLocation);
+
+        Files.deleteIfExists(swapFile.toPath());
+        try {
+            final SimpleSwapSerializer serializer = new SimpleSwapSerializer();
+            try (final FileOutputStream fos = new FileOutputStream(swapFile)) {
+                serializer.serializeFlowFiles(toSwap, flowFileQueue, swapLocation, fos);
+            }
+
+            final SimpleSwapDeserializer deserializer = new SimpleSwapDeserializer();
+            final SwapContents swappedIn;
+            try (final FileInputStream fis = new FileInputStream(swapFile);
+                    final DataInputStream dis = new DataInputStream(fis)) {
+                swappedIn = deserializer.deserializeFlowFiles(dis, swapLocation, flowFileQueue, resourceClaimManager);
+            }
+
+            assertEquals(toSwap.size(), swappedIn.getFlowFiles().size());
+            for (int i = 0; i < toSwap.size(); i++) {
+                final FlowFileRecord pre = toSwap.get(i);
+                final FlowFileRecord post = swappedIn.getFlowFiles().get(i);
+
+                assertEquals(pre.getSize(), post.getSize());
+                assertEquals(pre.getAttributes(), post.getAttributes());
+                assertEquals(pre.getSize(), post.getSize());
+                assertEquals(pre.getId(), post.getId());

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658577#comment-15658577
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87668539
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/swap/TestFlowFile.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.apache.nifi.flowfile.FlowFile;
+
+public class TestFlowFile implements FlowFileRecord {
--- End diff --

Can this be renamed to MockFlowFile? "TestFlowFile" sounds like it is the 
unit test class for FlowFile.



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658575#comment-15658575
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87672152
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/test/java/org/apache/nifi/controller/swap/TestSimpleSwapSerializerDeserializer.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.swap;
+
+import static org.junit.Assert.assertEquals;
+
+import java.io.DataInputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.FlowFileRecord;
+import org.apache.nifi.controller.repository.SwapContents;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardResourceClaimManager;
+import org.apache.nifi.stream.io.NullOutputStream;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+public class TestSimpleSwapSerializerDeserializer {
+    @Before
+    public void setup() {
+        TestFlowFile.resetIdGenerator();
+    }
+
+    @Test
+    public void testRoundTripSerializeDeserialize() throws IOException {
+        final ResourceClaimManager resourceClaimManager = new StandardResourceClaimManager();
+
+        final List<FlowFileRecord> toSwap = new ArrayList<>(1);
+        final Map<String, String> attrs = new HashMap<>();
+        for (int i = 0; i < 1; i++) {
+            attrs.put("i", String.valueOf(i));
+            final FlowFileRecord ff = new TestFlowFile(attrs, i, resourceClaimManager);
+            toSwap.add(ff);
+        }
+
+        final FlowFileQueue flowFileQueue = Mockito.mock(FlowFileQueue.class);
+        Mockito.when(flowFileQueue.getIdentifier()).thenReturn("87bb99fe-412c-49f6-a441-d1b0af4e20b4");
+
+        final String swapLocation = "target/testRoundTrip-" + UUID.randomUUID().toString() + ".swap";
--- End diff --

Shouldn't the UUID in the middle of the swap location be the same as the 
flowFileQueue identifier set above?



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658571#comment-15658571
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87648546
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java
 ---
@@ -121,8 +121,8 @@
 
     private int removedCount = 0; // number of flowfiles removed in this session
     private long removedBytes = 0L; // size of all flowfiles removed in this session
-    private final AtomicLong bytesRead = new AtomicLong(0L);
-    private final AtomicLong bytesWritten = new AtomicLong(0L);
+    private long bytesRead = 0L;
+    private long bytesWritten = 0L;
--- End diff --

Why change these from AtomicLongs? Is it certain that none of the places 
where they are read or written can be reached from multiple threads?
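
To make the trade-off behind this question concrete, a minimal sketch under stated assumptions: `ByteCounters` and its methods are hypothetical and are not the StandardProcessSession code. A plain `long` field is cheaper and is correct as long as every read and write stays on one thread; `AtomicLong` stays correct when several threads update it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not the StandardProcessSession code.
class ByteCounters {
    private long plainBytes = 0L;                               // correct only when confined to one thread
    private final AtomicLong atomicBytes = new AtomicLong(0L);  // correct under concurrent updates

    void addPlain(final long n) { plainBytes += n; }
    void addAtomic(final long n) { atomicBytes.addAndGet(n); }
    long plain() { return plainBytes; }
    long atomic() { return atomicBytes.get(); }

    // Increments the plain counter from the calling thread only.
    static long singleThreadedPlainTotal(final int adds) {
        final ByteCounters counters = new ByteCounters();
        for (int i = 0; i < adds; i++) {
            counters.addPlain(1L);
        }
        return counters.plain();
    }

    // Increments the atomic counter from several threads and returns the total after all have finished.
    static long concurrentAtomicTotal(final int threads, final int addsPerThread) {
        final ByteCounters counters = new ByteCounters();
        final Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    counters.addAtomic(1L);
                }
            });
            workers[t].start();
        }
        try {
            for (final Thread worker : workers) {
                worker.join();
            }
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counters.atomic();
    }

    public static void main(final String[] args) {
        System.out.println(singleThreadedPlainTotal(40_000));   // 40000
        System.out.println(concurrentAtomicTotal(4, 10_000));   // 40000
    }
}
```

Calling `addPlain` from several threads at once could lose updates, which is why the change is only safe if the session's counters are never touched concurrently.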


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>  Reporter: Mark Payne
>  Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to older version, if seen after a rollback, 
> will simply be ignored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658576#comment-15658576
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87631014
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadRepositoryRecordSerde.java
 ---
@@ -0,0 +1,517 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.repository;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.apache.nifi.flowfile.FlowFile;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.wali.SerDe;
+import org.wali.UpdateType;
+
+public class WriteAheadRepositoryRecordSerde extends RepositoryRecordSerde implements SerDe<RepositoryRecord> {
+    private static final Logger logger = LoggerFactory.getLogger(WriteAheadRepositoryRecordSerde.class);
+
+    private static final int CURRENT_ENCODING_VERSION = 9;
+
+    public static final byte ACTION_CREATE = 0;
+    public static final byte ACTION_UPDATE = 1;
+    public static final byte ACTION_DELETE = 2;
+    public static final byte ACTION_SWAPPED_OUT = 3;
+    public static final byte ACTION_SWAPPED_IN = 4;
+
+    private long recordsRestored = 0L;
+    private final ResourceClaimManager claimManager;
+
+    public WriteAheadRepositoryRecordSerde(final ResourceClaimManager claimManager) {
+        this.claimManager = claimManager;
+    }
+
+    @Override
+    public void serializeEdit(final RepositoryRecord previousRecordState, final RepositoryRecord record, final DataOutputStream out) throws IOException {
+        serializeEdit(previousRecordState, record, out, false);
+    }
+
+    public void serializeEdit(final RepositoryRecord previousRecordState, final RepositoryRecord record, final DataOutputStream out, final boolean forceAttributesWritten) throws IOException {
+        if (record.isMarkedForAbort()) {
+            logger.warn("Repository Record {} is marked to be aborted; it will be persisted in the FlowFileRepository as a DELETE record", record);
+            out.write(ACTION_DELETE);
+            out.writeLong(getRecordIdentifier(record));
+            serializeContentClaim(record.getCurrentClaim(), record.getCurrentClaimOffset(), out);
+            return;
+        }
+
+        final UpdateType updateType = getUpdateType(record);
+
+        if (updateType.equals(UpdateType.DELETE)) {
+            out.write(ACTION_DELETE);
+            out.writeLong(getRecordIdentifier(record));
+            serializeContentClaim(record.getCurrentClaim(), record.getCurrentClaimOffset(), out);
+            return;
+        }
+
+        // If there's a Destination Connection, that's the one that we want to associated with this record.
+        // However, on restart, we will restore the FlowFile and set this connection to its "originalConnection".
+        // If we then serialize the FlowFile again before it's transferred, it's important to allow this to happen,
+        // so we use the originalConnection instead
+        FlowFileQueue associatedQueue = record.getDestination();
+        if (associatedQueue == null) {
+   

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658574#comment-15658574
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87677686
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/test/java/org/apache/nifi/provenance/TestPersistentProvenanceRepository.java
 ---
@@ -1914,112 +1916,6 @@ public void testFailureToCreateWriterDoesNotPreventSubsequentRollover() throws I
     }
 
 
-    @Test
-    public void testBehaviorOnOutOfMemory() throws IOException, InterruptedException {
--- End diff --

Why was this test removed?


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important roll in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to an older version, if seen after a rollback, 
> will simply be ignored.
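
The provenance rules quoted above (unknown fields are ignored after a rollback, removed optional fields fall back to a sensible default, required fields stay required) can be sketched as follows. This is only an illustration under assumed names: `RollbackReadSketch`, `readWithOlderSchema` and the field names are hypothetical and are not taken from the pull request.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the NiFi code base.
final class RollbackReadSketch {

    /**
     * Reads a record written by a newer version using the fields an older version knows.
     *
     * @param written          field name to value, as written by the newer version
     * @param requiredFields   fields the older version requires; each must be present
     * @param optionalDefaults optional fields the older version knows, with their defaults
     */
    static Map<String, Object> readWithOlderSchema(final Map<String, Object> written,
                                                   final Set<String> requiredFields,
                                                   final Map<String, Object> optionalDefaults) {
        final Map<String, Object> result = new HashMap<>();

        // A required field must always be present, whichever version wrote the data.
        for (final String name : requiredFields) {
            if (!written.containsKey(name)) {
                throw new IllegalStateException("Required field missing: " + name);
            }
            result.put(name, written.get(name));
        }

        // An optional field that the newer version no longer writes gets its default.
        for (final Map.Entry<String, Object> entry : optionalDefaults.entrySet()) {
            final String name = entry.getKey();
            result.put(name, written.containsKey(name) ? written.get(name) : entry.getValue());
        }

        // Any other field in 'written' is unknown to the older version and is ignored.
        return result;
    }

    public static void main(final String[] args) {
        final Map<String, Object> written = new HashMap<>();
        written.put("Event ID", 42L);
        written.put("Some New Field", "added in a later minor release");

        final Map<String, Object> read = readWithOlderSchema(
            written,
            Collections.singleton("Event ID"),
            Collections.<String, Object>singletonMap("Details", ""));

        System.out.println(read.keySet());
    }
}
```

The point of the sketch is that the reader, not the writer, decides which fields matter, which is what makes a rollback tolerable.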



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658581#comment-15658581
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87657911
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FileSystemSwapManager.java
 ---
@@ -251,353 +262,36 @@ public SwapSummary getSwapSummary(final String 
swapLocation) throws IOException
 final InputStream bufferedIn = new 
BufferedInputStream(fis);
 final DataInputStream in = new 
DataInputStream(bufferedIn)) {
 
-final int swapEncodingVersion = in.readInt();
-if (swapEncodingVersion > SWAP_ENCODING_VERSION) {
-final String errMsg = "Cannot swap FlowFiles in from " + 
swapFile + " because the encoding version is "
-+ swapEncodingVersion + ", which is too new 
(expecting " + SWAP_ENCODING_VERSION + " or less)";
-
-eventReporter.reportEvent(Severity.ERROR, EVENT_CATEGORY, 
errMsg);
-throw new IOException(errMsg);
-}
-
-final int numRecords;
-final long contentSize;
-Long maxRecordId = null;
-try {
-in.readUTF(); // ignore Connection ID
-numRecords = in.readInt();
-contentSize = in.readLong();
-
-if (numRecords == 0) {
-return StandardSwapSummary.EMPTY_SUMMARY;
-}
-
-if (swapEncodingVersion > 7) {
-maxRecordId = in.readLong();
-}
-} catch (final EOFException eof) {
-logger.warn("Found premature End-of-File when reading Swap 
File {}. EOF occurred before any FlowFiles were encountered", swapLocation);
-return StandardSwapSummary.EMPTY_SUMMARY;
-}
-
-final QueueSize queueSize = new QueueSize(numRecords, 
contentSize);
-final SwapContents swapContents = deserializeFlowFiles(in, 
queueSize, maxRecordId, swapEncodingVersion, true, claimManager, swapLocation);
-return swapContents.getSummary();
-}
-}
-
-public static int serializeFlowFiles(final List<FlowFileRecord> 
toSwap, final FlowFileQueue queue, final String swapLocation, final 
OutputStream destination) throws IOException {
-if (toSwap == null || toSwap.isEmpty()) {
-return 0;
-}
-
-long contentSize = 0L;
-for (final FlowFileRecord record : toSwap) {
-contentSize += record.getSize();
-}
-
-// persist record to disk via the swap file
-final OutputStream bufferedOut = new 
BufferedOutputStream(destination);
-final DataOutputStream out = new DataOutputStream(bufferedOut);
-try {
-out.writeInt(SWAP_ENCODING_VERSION);
-out.writeUTF(queue.getIdentifier());
-out.writeInt(toSwap.size());
-out.writeLong(contentSize);
-
-// get the max record id and write that out so that we know it 
quickly for restoration
-long maxRecordId = 0L;
-for (final FlowFileRecord flowFile : toSwap) {
-if (flowFile.getId() > maxRecordId) {
-maxRecordId = flowFile.getId();
-}
-}
-
-out.writeLong(maxRecordId);
-
-for (final FlowFileRecord flowFile : toSwap) {
-out.writeLong(flowFile.getId());
-out.writeLong(flowFile.getEntryDate());
-out.writeLong(flowFile.getLineageStartDate());
-out.writeLong(flowFile.getLineageStartIndex());
-out.writeLong(flowFile.getLastQueueDate());
-out.writeLong(flowFile.getQueueDateIndex());
-out.writeLong(flowFile.getSize());
-
-final ContentClaim claim = flowFile.getContentClaim();
-if (claim == null) {
-out.writeBoolean(false);
-} else {
-out.writeBoolean(true);
-final ResourceClaim resourceClaim = 
claim.getResourceClaim();
-out.writeUTF(resourceClaim.getId());
-out.writeUTF(resourceClaim.getContainer());
-out.writeUTF(resourceClaim.getSection());
-out.writeLong(claim.getOffset());
-  

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658573#comment-15658573
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87629214
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadRepositoryRecordSerde.java
 ---
@@ -0,0 +1,517 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.controller.repository;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.nifi.controller.queue.FlowFileQueue;
+import org.apache.nifi.controller.repository.claim.ContentClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaim;
+import org.apache.nifi.controller.repository.claim.ResourceClaimManager;
+import org.apache.nifi.controller.repository.claim.StandardContentClaim;
+import org.apache.nifi.flowfile.FlowFile;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.wali.SerDe;
+import org.wali.UpdateType;
+
+public class WriteAheadRepositoryRecordSerde extends RepositoryRecordSerde 
implements SerDe<RepositoryRecord> {
+private static final Logger logger = 
LoggerFactory.getLogger(WriteAheadRepositoryRecordSerde.class);
+
+private static final int CURRENT_ENCODING_VERSION = 9;
+
+public static final byte ACTION_CREATE = 0;
+public static final byte ACTION_UPDATE = 1;
+public static final byte ACTION_DELETE = 2;
+public static final byte ACTION_SWAPPED_OUT = 3;
+public static final byte ACTION_SWAPPED_IN = 4;
+
+private long recordsRestored = 0L;
+private final ResourceClaimManager claimManager;
+
+public WriteAheadRepositoryRecordSerde(final ResourceClaimManager 
claimManager) {
+this.claimManager = claimManager;
+}
+
+@Override
+public void serializeEdit(final RepositoryRecord previousRecordState, 
final RepositoryRecord record, final DataOutputStream out) throws IOException {
+serializeEdit(previousRecordState, record, out, false);
+}
+
+public void serializeEdit(final RepositoryRecord previousRecordState, 
final RepositoryRecord record, final DataOutputStream out, final boolean 
forceAttributesWritten) throws IOException {
+if (record.isMarkedForAbort()) {
+logger.warn("Repository Record {} is marked to be aborted; it 
will be persisted in the FlowFileRepository as a DELETE record", record);
+out.write(ACTION_DELETE);
+out.writeLong(getRecordIdentifier(record));
+serializeContentClaim(record.getCurrentClaim(), 
record.getCurrentClaimOffset(), out);
+return;
+}
+
+final UpdateType updateType = getUpdateType(record);
+
+if (updateType.equals(UpdateType.DELETE)) {
+out.write(ACTION_DELETE);
+out.writeLong(getRecordIdentifier(record));
+serializeContentClaim(record.getCurrentClaim(), 
record.getCurrentClaimOffset(), out);
+return;
+}
+
+// If there's a Destination Connection, that's the one that we 
want to associated with this record.
+// However, on restart, we will restore the FlowFile and set this 
connection to its "originalConnection".
+// If we then serialize the FlowFile again before it's 
transferred, it's important to allow this to happen,
+// so we use the originalConnection instead
+FlowFileQueue associatedQueue = record.getDestination();
+if (associatedQueue == null) {
+   

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658580#comment-15658580
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87677086
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/schema/FieldSerializer.java
 ---
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.provenance.schema;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+
+import org.apache.nifi.provenance.ProvenanceEventRecord;
+
+public interface FieldSerializer {
--- End diff --

IntelliJ tells me this class isn't used. Is it a remnant of design iterations?




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658578#comment-15658578
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87654483
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FileSystemSwapManager.java
 ---
@@ -210,30 +215,36 @@ public boolean accept(final File dir, final String 
name) {
 // "<timestamp>-<queue identifier>-<uuid>.swap". If we 
have two dashes, then we can just check if the queue ID is equal
 // to the id of the queue given and if not we can just move on.
 final String[] splits = swapFile.getName().split("-");
-if (splits.length == 3) {
-final String queueIdentifier = splits[1];
-if 
(!queueIdentifier.equals(flowFileQueue.getIdentifier())) {
-continue;
+if (splits.length > 6) {
--- End diff --

Was this broken before? The ID scheme for the queue hasn't changed (it is 
still a UUID).
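
To make the question concrete: a UUID itself contains four dashes, so splitting a swap file name on "-" yields far more than three pieces when the queue ID is a UUID. The sketch below assumes a file name made of a timestamp, the queue UUID and a trailing UUID, joined by dashes. That layout, the class name `SwapFileNameSketch` and the fallback branch for dash-free queue IDs are assumptions for illustration, not the code in the pull request.

```java
import java.util.Arrays;
import java.util.UUID;

// Hypothetical sketch; the file-name layout is assumed, not confirmed by the diff.
final class SwapFileNameSketch {

    /**
     * Extracts the queue identifier from a swap file name, or returns null if the
     * name does not match either assumed layout.
     */
    static String extractQueueId(final String fileName) {
        final String[] splits = fileName.split("-");

        // 1 timestamp piece + 5 queue-UUID pieces + 5 trailing-UUID pieces = 11 pieces,
        // so the queue UUID is pieces 1 through 5 joined back together.
        if (splits.length > 6) {
            return String.join("-", Arrays.copyOfRange(splits, 1, 6));
        }

        // Assumed older layout: a queue identifier without any dashes.
        if (splits.length == 3) {
            return splits[1];
        }

        return null;
    }

    public static void main(final String[] args) {
        final String queueId = UUID.randomUUID().toString();
        final String name = System.currentTimeMillis() + "-" + queueId + "-" + UUID.randomUUID() + ".swap";

        System.out.println(name.split("-").length);
        System.out.println(queueId.equals(extractQueueId(name)));
    }
}
```

Under these assumptions a `splits.length == 3` check can never match a UUID queue ID, which would explain why the condition was changed.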




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658555#comment-15658555
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user joshelser commented on the issue:

https://github.com/apache/nifi/pull/1202
  
Have you considered using a library such as Google Protocol Buffers to 
reduce the debt on NiFi in maintaining custom serialization logic?




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657694#comment-15657694
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87631992
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/Requirement.java
 ---
@@ -0,0 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+public enum Requirement {
--- End diff --

I believe you are correct.




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657696#comment-15657696
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87632081
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/RecordSchema.java
 ---
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class RecordSchema {
+private static final String FIELD_NAME = "Field Name";
+private static final String FIELD_TYPE = "Field Type";
+private static final String REPETITION = "Repetition";
+private static final String SUBFIELDS = "SubFields";
+
+private static final String STRING_TYPE = "String";
+private static final String INT_TYPE = "Integer";
+private static final String LONG_TYPE = "Long";
+private static final String SUBFIELD_TYPE = "SubFieldList";
+
+private final List<RecordField> fields;
+
+public RecordSchema(final List<RecordField> fields) {
+this.fields = fields;
+}
+
+public RecordSchema(final RecordField... fields) {
+this(Arrays.asList(fields));
+}
+
+public List<RecordField> getFields() {
+return fields;
+}
+
+public RecordField getField(final String fieldName) {
+return fields.stream()
+.filter(field -> field.getFieldName().equals(fieldName))
+.findFirst()
+.orElse(null);
+}
+
+public void writeTo(final OutputStream out) throws IOException {
+try {
+final DataOutputStream dos = (out instanceof DataOutputStream) 
? (DataOutputStream) out : new DataOutputStream(out);
+
+dos.writeInt(fields.size());
+for (final RecordField field : fields) {
+writeField(field, dos);
+}
+} catch (final IOException ioe) {
+throw new IOException("Unable to write Record Schema to 
stream", ioe);
+}
+}
+
+private void writeField(final RecordField field, final 
DataOutputStream dos) throws IOException {
+dos.writeInt(4);// 4 fields.
--- End diff --

Thanks for pointing this out. Will re-word the comments.
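
The ambiguity in the "4 fields" comment is easier to see in a simplified round trip: one count is the number of fields in the schema, the other is the number of attributes that describe a single field. The sketch below is an assumption-laden illustration, not the PR's on-disk format. It reuses the attribute names "Field Name", "Field Type" and "Repetition" from the constants in the diff, but the classes `SchemaHeaderSketch` and `Field`, the three-attribute layout and the read side are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a self-describing schema header.
final class SchemaHeaderSketch {

    static final class Field {
        final String name;
        final String type;
        final String repetition;

        Field(final String name, final String type, final String repetition) {
            this.name = name;
            this.type = type;
            this.repetition = repetition;
        }
    }

    static void writeSchema(final List<Field> fields, final OutputStream out) throws IOException {
        final DataOutputStream dos = (out instanceof DataOutputStream)
            ? (DataOutputStream) out : new DataOutputStream(out);

        dos.writeInt(fields.size()); // number of fields in the schema
        for (final Field field : fields) {
            dos.writeInt(3); // number of attributes that describe this one field
            dos.writeUTF("Field Name");
            dos.writeUTF(field.name);
            dos.writeUTF("Field Type");
            dos.writeUTF(field.type);
            dos.writeUTF("Repetition");
            dos.writeUTF(field.repetition);
        }
        dos.flush();
    }

    static List<Field> readSchema(final InputStream in) throws IOException {
        final DataInputStream dis = new DataInputStream(in);

        final int fieldCount = dis.readInt();
        final List<Field> fields = new ArrayList<>(fieldCount);
        for (int i = 0; i < fieldCount; i++) {
            // Read every attribute by name, so attributes this reader does not know are skipped.
            final int attributeCount = dis.readInt();
            final Map<String, String> attributes = new HashMap<>();
            for (int j = 0; j < attributeCount; j++) {
                attributes.put(dis.readUTF(), dis.readUTF());
            }
            fields.add(new Field(attributes.get("Field Name"),
                                 attributes.get("Field Type"),
                                 attributes.get("Repetition")));
        }
        return fields;
    }

    public static void main(final String[] args) throws IOException {
        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
        writeSchema(Arrays.asList(new Field("Event ID", "Long", "EXACTLY_ONE")), baos);

        final List<Field> read = readSchema(new ByteArrayInputStream(baos.toByteArray()));
        System.out.println(read.get(0).name + " : " + read.get(0).type);
    }
}
```

Naming the two counts in the comments, as done here, is one way to avoid the confusion raised in the review.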



[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657693#comment-15657693
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87631938
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/UnionRecordField.java
 ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.Arrays;
+import java.util.List;
+
+public class UnionRecordField implements RecordField {
+private final String fieldName;
+private final Repetition repetition;
+private final List<RecordField> possibilities;
+
+public UnionRecordField(final String fieldName, final Repetition 
repetition, final RecordField... possibilities) {
+this(fieldName, repetition, Arrays.asList(possibilities));
+}
+
+public UnionRecordField(final String fieldName, final Repetition 
repetition, final List<RecordField> possibilities) {
--- End diff --

Good call. Will add requireNonNull checks.
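
A minimal sketch of what such null checks could look like, assuming the constructor shape shown in the diff. `UnionFieldSketch`, its nested `Repetition` enum and the use of `String` for the possibilities are stand-ins so the example is self-contained; the defensive copy is an extra design choice, not something the review asked for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a union field; not the class from the pull request.
final class UnionFieldSketch {

    // Stand-in enum so the sketch compiles on its own.
    enum Repetition { EXACTLY_ONE, ZERO_OR_ONE, ZERO_OR_MORE }

    private final String fieldName;
    private final Repetition repetition;
    private final List<String> possibilities;

    public UnionFieldSketch(final String fieldName, final Repetition repetition, final String... possibilities) {
        this(fieldName, repetition, Arrays.asList(Objects.requireNonNull(possibilities, "possibilities")));
    }

    public UnionFieldSketch(final String fieldName, final Repetition repetition, final List<String> possibilities) {
        // Fail at construction time with a message naming the offending argument.
        this.fieldName = Objects.requireNonNull(fieldName, "fieldName");
        this.repetition = Objects.requireNonNull(repetition, "repetition");
        // Copy so later changes to the caller's list cannot alter this field.
        this.possibilities = Collections.unmodifiableList(
            new ArrayList<>(Objects.requireNonNull(possibilities, "possibilities")));
    }

    public String getFieldName() {
        return fieldName;
    }

    public Repetition getRepetition() {
        return repetition;
    }

    public List<String> getPossibilities() {
        return possibilities;
    }
}
```

Checking in the constructor means a null argument surfaces where the field is built rather than later, when a record is serialized.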


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.1.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important roll in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes - not 
> incremental.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to older version, if seen after a rollback, 
> will simply be ignored.
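
The rollback rules above (unknown fields are ignored, a removed optional field needs a sensible default, a required field stays required) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: a stored record is modeled as a plain map, and `KnownField` and `TolerantRecordReader` are hypothetical names, not part of NiFi.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical description of a field that the running (possibly older) version knows about.
final class KnownField {
    final String name;
    final boolean required;
    final Object defaultValue; // only consulted for optional fields

    KnownField(final String name, final boolean required, final Object defaultValue) {
        this.name = name;
        this.required = required;
        this.defaultValue = defaultValue;
    }
}

final class TolerantRecordReader {
    // Reads only the fields this version knows about from data that may have been written by a newer version.
    static Map<String, Object> read(final Map<String, Object> stored, final KnownField... knownFields) {
        final Map<String, Object> result = new LinkedHashMap<>();
        for (final KnownField field : knownFields) {
            if (stored.containsKey(field.name)) {
                result.put(field.name, stored.get(field.name));
            } else if (field.required) {
                // A required field must always be present in the stored data.
                throw new IllegalStateException("Required field missing: " + field.name);
            } else {
                // The field was removed by a newer version: fall back to the default.
                result.put(field.name, field.defaultValue);
            }
        }
        // Entries in 'stored' that are not listed in knownFields are simply ignored.
        return result;
    }
}
```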



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657690#comment-15657690
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87631735
  
--- Diff: 
nifi-framework-api/src/main/java/org/apache/nifi/controller/repository/claim/ResourceClaim.java
 ---
@@ -64,4 +64,28 @@
      * @return true if the Resource Claim is in use, false otherwise
      */
     boolean isInUse();
+
+
+    /**
+     * Provides the natural ordering for ResourceClaim objects. By default they are sorted by their id, then container, then section
+     *
+     * @param other other claim
+     * @return x such that x <=1 if this is less than other;
--- End diff --

Hmmm, this is a very odd comment in the code. Will address.


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.1.0


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655771#comment-15655771
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87520923
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/RecordSchema.java
 ---
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class RecordSchema {
+    private static final String FIELD_NAME = "Field Name";
+    private static final String FIELD_TYPE = "Field Type";
+    private static final String REPETITION = "Repetition";
+    private static final String SUBFIELDS = "SubFields";
+
+    private static final String STRING_TYPE = "String";
+    private static final String INT_TYPE = "Integer";
+    private static final String LONG_TYPE = "Long";
+    private static final String SUBFIELD_TYPE = "SubFieldList";
+
+    private final List<RecordField> fields;
+
+    public RecordSchema(final List<RecordField> fields) {
+        this.fields = fields;
+    }
+
+    public RecordSchema(final RecordField... fields) {
+        this(Arrays.asList(fields));
+    }
+
+    public List<RecordField> getFields() {
+        return fields;
+    }
+
+    public RecordField getField(final String fieldName) {
+        return fields.stream()
+            .filter(field -> field.getFieldName().equals(fieldName))
+            .findFirst()
+            .orElse(null);
+    }
+
+    public void writeTo(final OutputStream out) throws IOException {
+        try {
+            final DataOutputStream dos = (out instanceof DataOutputStream) ? (DataOutputStream) out : new DataOutputStream(out);
+
+            dos.writeInt(fields.size());
+            for (final RecordField field : fields) {
+                writeField(field, dos);
+            }
+        } catch (final IOException ioe) {
+            throw new IOException("Unable to write Record Schema to stream", ioe);
+        }
+    }
+
+    private void writeField(final RecordField field, final DataOutputStream dos) throws IOException {
+        dos.writeInt(4);    // 4 fields.
--- End diff --

This comment is/was very confusing to me.  (I believe) this is writing the 
4 values defining a field, not actually writing 4 fields.

This goes hand-in-hand with the loop in "readField" (that it's not reading 
multiple fields but is reading the values defining a field).
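
To illustrate the distinction, here is a rough sketch in which the leading integer counts the attributes that describe one field, and the reader loops over those attributes. The on-disk layout and the name `FieldDescriptorCodec` are assumptions made for this example (it writes three attributes and omits sub-fields); the actual encoding in the pull request may differ.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical codec: one schema field is stored as a count followed by name/value attribute pairs.
final class FieldDescriptorCodec {

    static void writeField(final String name, final String type, final String repetition, final DataOutputStream dos) throws IOException {
        final Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("Field Name", name);
        attributes.put("Field Type", type);
        attributes.put("Repetition", repetition);

        // Number of attributes that describe this single field, not a number of fields.
        dos.writeInt(attributes.size());
        for (final Map.Entry<String, String> entry : attributes.entrySet()) {
            dos.writeUTF(entry.getKey());
            dos.writeUTF(entry.getValue());
        }
    }

    static Map<String, String> readField(final DataInputStream dis) throws IOException {
        // The loop iterates over the attributes of one field; it still returns exactly one field description.
        final int attributeCount = dis.readInt();
        final Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 0; i < attributeCount; i++) {
            final String key = dis.readUTF();
            final String value = dis.readUTF();
            attributes.put(key, value);
        }
        return attributes;
    }
}
```

Writing the count up front is also what would let an older reader skip attributes it does not recognize, since it knows how many pairs to consume.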


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655775#comment-15655775
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87510987
  
--- Diff: 
nifi-commons/nifi-write-ahead-log/src/main/java/org/wali/SingletonSerDeFactory.java
 ---
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.wali;
+
+public class SingletonSerDeFactory<T> implements SerDeFactory<T> {
+    private final SerDe<T> serde;
+
+    public SingletonSerDeFactory(final SerDe<T> serde) {
+        this.serde = serde;
+    }
+
+    @Override
+    public SerDe<T> createSerDe(final String encodingName) {
+        return serde;
--- End diff --

Ignoring the "encodingName" parameter seems wrong. Any way to check if the 
serde set matches the passed encoding name?
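
One possible shape for such a check, as a sketch only. `SerDeSketch` and `CheckedSingletonSerDeFactory` are hypothetical names, the expected encoding name is passed in explicitly because the real `SerDe` interface is not shown here, and treating a null encoding name as "no preference" is an assumption.

```java
import java.util.Objects;

// Hypothetical stand-in for the write-ahead-log SerDe interface.
interface SerDeSketch<T> {
}

final class CheckedSingletonSerDeFactory<T> {
    private final SerDeSketch<T> serde;
    private final String expectedEncodingName;

    CheckedSingletonSerDeFactory(final SerDeSketch<T> serde, final String expectedEncodingName) {
        this.serde = Objects.requireNonNull(serde, "serde");
        this.expectedEncodingName = Objects.requireNonNull(expectedEncodingName, "expectedEncodingName");
    }

    SerDeSketch<T> createSerDe(final String encodingName) {
        // Assumption: null means the caller has no preference, so the single serde is returned.
        if (encodingName != null && !expectedEncodingName.equals(encodingName)) {
            throw new IllegalArgumentException("Requested encoding '" + encodingName
                + "' does not match the encoding of the configured serde '" + expectedEncodingName + "'");
        }
        return serde;
    }
}
```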


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655776#comment-15655776
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87486068
  
--- Diff: 
nifi-framework-api/src/main/java/org/apache/nifi/controller/repository/claim/ResourceClaim.java
 ---
@@ -64,4 +64,28 @@
      * @return true if the Resource Claim is in use, false otherwise
      */
     boolean isInUse();
+
+
+    /**
+     * Provides the natural ordering for ResourceClaim objects. By default they are sorted by their id, then container, then section
+     *
+     * @param other other claim
+     * @return x such that x <=1 if this is less than other;
--- End diff --

I believe this should be "-1"
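
As a side note, the `Comparable` contract only requires a negative integer, zero, or a positive integer, so the Javadoc could describe the sign rather than a specific value. A small sketch of the ordering described in the comment (id, then container, then section), using a hypothetical `ClaimSketch` class rather than the real `ResourceClaim` interface:

```java
import java.util.Comparator;

// Hypothetical value class used only to illustrate the ordering; not the NiFi ResourceClaim.
final class ClaimSketch implements Comparable<ClaimSketch> {
    private static final Comparator<ClaimSketch> NATURAL_ORDER =
        Comparator.comparing(ClaimSketch::getId)
            .thenComparing(ClaimSketch::getContainer)
            .thenComparing(ClaimSketch::getSection);

    private final String id;
    private final String container;
    private final String section;

    ClaimSketch(final String id, final String container, final String section) {
        this.id = id;
        this.container = container;
        this.section = section;
    }

    String getId() {
        return id;
    }

    String getContainer() {
        return container;
    }

    String getSection() {
        return section;
    }

    /**
     * @param other other claim
     * @return a negative integer, zero, or a positive integer as this claim is
     *         less than, equal to, or greater than the other claim
     */
    @Override
    public int compareTo(final ClaimSketch other) {
        return NATURAL_ORDER.compare(this, other);
    }
}
```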


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655773#comment-15655773
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87487455
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/BufferedInputStream.java
 ---
@@ -16,19 +16,445 @@
  */
 package org.apache.nifi.stream.io;
 
+import java.io.IOException;
 import java.io.InputStream;
 
 /**
  * This class is a slight modification of the BufferedInputStream in the java.io package. The modification is that this implementation does not provide synchronization on method calls, which means
  * that this class is not suitable for use by multiple threads. However, the absence of these synchronized blocks results in potentially much better performance.
  */
-public class BufferedInputStream extends java.io.BufferedInputStream {
+public class BufferedInputStream extends InputStream {
--- End diff --

Can this be renamed to explicitly call out that it is not synchronized? 
This is part of nifi-utils, and I could easily see someone using it 
unknowingly because their IDE offers it as an option. Or does being part of 
nifi-utils make it public API that can't be changed?
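
To make the suggestion concrete, a rough sketch of a buffered stream whose name states the lack of synchronization. The class name `UnsynchronizedBufferedInputStream` is hypothetical, and the body is a deliberately minimal buffer, not the implementation in the pull request.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical name that makes the missing synchronization explicit. NOT safe for use by multiple threads.
final class UnsynchronizedBufferedInputStream extends InputStream {
    private final InputStream in;
    private final byte[] buffer;
    private int position; // index of the next buffered byte to return
    private int limit;    // number of valid bytes currently in the buffer

    UnsynchronizedBufferedInputStream(final InputStream in, final int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public int read() throws IOException {
        if (position >= limit) {
            // Buffer exhausted: refill from the underlying stream without taking any lock.
            limit = in.read(buffer, 0, buffer.length);
            position = 0;
            if (limit <= 0) {
                limit = 0;
                return -1; // end of stream
            }
        }
        return buffer[position++] & 0xFF;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```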


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655772#comment-15655772
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87517252
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/UnionRecordField.java
 ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+import java.util.Arrays;
+import java.util.List;
+
+public class UnionRecordField implements RecordField {
+    private final String fieldName;
+    private final Repetition repetition;
+    private final List<RecordField> possibilities;
+
+    public UnionRecordField(final String fieldName, final Repetition repetition, final RecordField... possibilities) {
+        this(fieldName, repetition, Arrays.asList(possibilities));
+    }
+
+    public UnionRecordField(final String fieldName, final Repetition repetition, final List<RecordField> possibilities) {
--- End diff --

ComplexRecordField and SimpleRecordField call "Objects.requireNonNull" on 
their three inputs but "UnionRecordField" and "MapRecordField" do not. Is that 
intended?


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0

[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655774#comment-15655774
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on a diff in the pull request:

https://github.com/apache/nifi/pull/1202#discussion_r87517591
  
--- Diff: 
nifi-commons/nifi-schema-utils/src/main/java/org/apache/nifi/repository/schema/Requirement.java
 ---
@@ -0,0 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.repository.schema;
+
+public enum Requirement {
--- End diff --

I believe this is left over from design iterations and its functionality 
has been merged into "Repetition".


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0


[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655132#comment-15655132
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on the issue:

https://github.com/apache/nifi/pull/1202
  
Here are the failures:

```
Unapproved licenses:

  /Users/jpercivall/projects/nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/hs_err_pid36001.log
  /Users/jpercivall/projects/nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/hs_err_pid83554.log
```


> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.2.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write-ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes, not 
> incremental ones.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to the swap file header may allow some variation 
> there but the variation should only be hints to optimize how they're 
> processed and not change their behavior otherwise. Changes are only permitted 
> during minor version releases.
> * Provenance repository changes are only permitted during minor version 
> releases.  These changes may include adding or removing fields from existing 
> event types.  If a field is considered required it must always be considered 
> required.  If a field is removed then it must not be a required field and 
> there must be a sensible default an older version could use if that value is 
> not found in new data once rolled back.  New event types may be added.  
> Fields or event types not known to an older version, if seen after a rollback, 
> will simply be ignored.
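
The provenance rules quoted above (unknown fields are ignored after a rollback, a removed optional field falls back to a sensible default, a required field stays required) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class SchemaTolerantReader, the nested FieldSpec type, and the use of a plain Map as the "written record" are all hypothetical and are not taken from the NiFi codebase or from the pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these names are not taken from the NiFi codebase.
final class SchemaTolerantReader {

    // One field that this version of the reader knows about.
    static final class FieldSpec {
        final String name;
        final boolean required;
        final Object defaultValue; // used only when an optional field is absent

        FieldSpec(String name, boolean required, Object defaultValue) {
            this.name = name;
            this.required = required;
            this.defaultValue = defaultValue;
        }
    }

    private final List<FieldSpec> knownFields;

    SchemaTolerantReader(List<FieldSpec> knownFields) {
        this.knownFields = new ArrayList<>(knownFields);
    }

    // Maps a record written by any version onto the fields this reader knows.
    Map<String, Object> read(Map<String, Object> written) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (FieldSpec field : knownFields) {
            if (written.containsKey(field.name)) {
                result.put(field.name, written.get(field.name));
            } else if (field.required) {
                // A required field must always be present.
                throw new IllegalStateException("Missing required field: " + field.name);
            } else {
                // Optional field removed by a newer writer: fall back to the default.
                result.put(field.name, field.defaultValue);
            }
        }
        // Fields present in 'written' but unknown to this reader are never copied,
        // which is how they end up being ignored.
        return result;
    }
}
```

An older reader built with only the fields it knows would therefore accept data written by a newer version, as long as the newer version kept every required field.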





[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655114#comment-15655114
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on the issue:

https://github.com/apache/nifi/pull/1202
  
Looks like the PR failed the RAT check in Travis.




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655112#comment-15655112
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

Github user JPercivall commented on the issue:

https://github.com/apache/nifi/pull/1202
  
Reviewing




[jira] [Commented] (NIFI-2854) Enable repositories to support upgrades and rollback in well defined scenarios

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654799#comment-15654799
 ] 

ASF GitHub Bot commented on NIFI-2854:
--

GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/1202

NIFI-2854: Refactor repositories and swap files to use schema-based 
serialization so that nifi can be rolled back to a previous version after 
an upgrade.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markap14/nifi NIFI-2854

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/1202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1202


commit 5167505596e505ac89ed1fe975388ad2f35a963e
Author: Mark Payne 
Date:   2016-10-04T13:38:14Z

NIFI-2854: Refactor repositories and swap files to use schema-based 
serialization so that nifi can be rolled back to a previous version after an 
upgrade.




> Enable repositories to support upgrades and rollback in well defined scenarios
> --
>
> Key: NIFI-2854
> URL: https://issues.apache.org/jira/browse/NIFI-2854
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 1.2.0
>
>
> The flowfile, swapfile, provenance, and content repositories play a very 
> important role in NiFi's ability to be safely upgraded and rolled back.  We 
> need to have well-documented behaviors, designs, and version adherence so 
> that users can safely rely on these mechanisms.
> Once this is formalized and in place we should update our versioning guidance 
> to reflect this as well.
> The following would be true from NiFi 1.2.0 onward:
> * No changes to how the repositories are persisted to disk can be made which 
> will break forward/backward compatibility and specifically this means that 
> things like the way each is serialized to disk cannot change.
> * If changes are made which impact forward or backward compatibility they 
> should be reserved for major releases only and should include a utility to 
> help users with pre-existing data convert from some older format to the newer 
> format.  It may not be feasible to have rollback on major releases.
> * The content repository should not be changed within a major release cycle 
> in any way that will harm forward or backward compatibility.
> * The flow file repository can change in that new fields can be added to 
> existing write-ahead log record types but no fields can be removed nor can 
> any new types be added.  Once a field is considered required it must remain 
> required.  Changes may only be made across minor version changes, not 
> incremental ones.
> * Swap File storage should follow very similar rules to the flow file 
> repository.  Adding a schema to