[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877223#comment-15877223
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-tephra/pull/37


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.11.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877100#comment-15877100
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on the issue:

https://github.com/apache/incubator-tephra/pull/37
  
LGTM


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877028#comment-15877028
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102348853
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/TransactionManager.java ---
@@ -1123,15 +1112,14 @@ private boolean doTruncateInvalidTxBefore(long 
time) throws InvalidTruncateTimeE
 }
 
 // Find all invalid transactions earlier than truncateWp
-Set toTruncate = Sets.newHashSet();
-for (long wp : invalid) {
-  // invalid list is sorted, hence can stop as soon as we reach a wp 
>= truncateWp
-  if (wp >= truncateWp) {
-break;
+LongSet toTruncate = new LongArraySet();
+for (long wp : invalidTxList.toRawList()) {
--- End diff --

Good point


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877027#comment-15877027
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102348469
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/TransactionManager.java ---
@@ -1123,15 +1112,14 @@ private boolean doTruncateInvalidTxBefore(long 
time) throws InvalidTruncateTimeE
 }
 
 // Find all invalid transactions earlier than truncateWp
-Set toTruncate = Sets.newHashSet();
-for (long wp : invalid) {
-  // invalid list is sorted, hence can stop as soon as we reach a wp 
>= truncateWp
-  if (wp >= truncateWp) {
-break;
+LongSet toTruncate = new LongArraySet();
+for (long wp : invalidTxList.toRawList()) {
--- End diff --

Unfortunately using the normal for-each loop means there would be boxing. 
To avoid boxing, you have to use the iterator directly

```java
LongIterator it = invalidTxList.toRawList().iterator();
while (it.hasNext()) {
  long wp = it.nextLong();
  ...
}
```


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877004#comment-15877004
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102346930
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/TransactionManager.java ---
@@ -1123,15 +1112,14 @@ private boolean doTruncateInvalidTxBefore(long 
time) throws InvalidTruncateTimeE
 }
 
 // Find all invalid transactions earlier than truncateWp
-Set toTruncate = Sets.newHashSet();
-for (long wp : invalid) {
-  // invalid list is sorted, hence can stop as soon as we reach a wp 
>= truncateWp
-  if (wp >= truncateWp) {
-break;
+LongSet toTruncate = new LongArraySet();
+for (long wp : invalidTxList.toRawList()) {
--- End diff --

I reversed the check and iterate through the whole invalid list now, so 
sorting is no longer necessary.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876965#comment-15876965
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102343745
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/TransactionManager.java ---
@@ -1123,15 +1112,14 @@ private boolean doTruncateInvalidTxBefore(long 
time) throws InvalidTruncateTimeE
 }
 
 // Find all invalid transactions earlier than truncateWp
-Set toTruncate = Sets.newHashSet();
-for (long wp : invalid) {
-  // invalid list is sorted, hence can stop as soon as we reach a wp 
>= truncateWp
-  if (wp >= truncateWp) {
-break;
+LongSet toTruncate = new LongArraySet();
+for (long wp : invalidTxList.toRawList()) {
--- End diff --

Shouldn't this use `toSortedArray`??


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876961#comment-15876961
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102343334
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import it.unimi.dsi.fastutil.longs.LongCollection;
+import it.unimi.dsi.fastutil.longs.LongList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toSortedArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean addAll(LongCollection ids) {
--- End diff --

Should use `LongList` instead of `LongCollection` as the `LongArrayList` 
only optimize if it is a `LongList`.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876833#comment-15876833
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102330757
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
--- End diff --

Done


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876834#comment-15876834
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on the issue:

https://github.com/apache/incubator-tephra/pull/37
  
@chtyim I have addressed the comments, please take a look


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876473#comment-15876473
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102284887
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public void clear() {
+invalid.clear();
+invalidArray = NO_INVALID_TX;
+dirty = false;
+  }
+
+  /**
+   * @return sorted array of invalid transactions
+   */
+  public long[] toArray() {
+lazyUpdate();
+return invalidArray;
+  }
+
+  /**
+   * @return sorted list of invalid transactions
+   */
+  public List toList() {
+lazyUpdate();
+return Collections.unmodifiableList(invalid);
+  }
+
+  private void lazyUpdate() {
+if (dirty) {
+  Collections.sort(invalid);
--- End diff --

Actually there is no requirement in sorting the invalid `LongList` and it 
is more efficient to sort the `invalidArray` directly using 
`Arrays.sort(long[])` method.

`Collections.sort` first turn the list to an `Object[]` (hence boxing), 
sort the array, then set the value back to the list one by one.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876472#comment-15876472
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102284134
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/TransactionManager.java ---
@@ -1124,7 +1110,7 @@ private boolean doTruncateInvalidTxBefore(long time) 
throws InvalidTruncateTimeE
 
 // Find all invalid transactions earlier than truncateWp
 Set toTruncate = Sets.newHashSet();
--- End diff --

Actually this can be a `LongSet` so that it can be passed to the 
`doTruncateInvalidTx` and eventually to `invalidTxList.removeAll` as a 
`LongSet` to have more efficient removal.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876474#comment-15876474
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102284511
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
--- End diff --

Better have the type as `LongList`.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876475#comment-15876475
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102282575
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+dirty = dirty || changed;
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
--- End diff --

the only call to this method is adding all child write pointers. It's 
better to take a `LongList` as the input so that the more efficient 
`invalid.addAll(LongList)` method can be called to avoid boxing 
(`LongArrayList` has a special implementation that use array copy instead of 
iterate + box one by one).


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875329#comment-15875329
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on the issue:

https://github.com/apache/incubator-tephra/pull/37
  
> Does the added synchronization add performance overhead?

Extra synchronization is added in five places in this PR -
1. canCommit() - when transaction is not in in-progress  list

2. Emitting metrics - metrics are emitted every 10 seconds

3. getInvalidSize() - called when Thrift endpoint is invoked by user or by 
monitoring scripts

4. and 5. getExcludedListSize() and logStatistics() - these methods are 
used in test or unused

The only significant usage is by canCommit() when the committing 
transaction is not in the in-progress list. I think this overhead is acceptable 
since we now sort the invalid list only when creating a transaction, as opposed 
to sorting it whenever the list changed (like we did earlier).




> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875330#comment-15875330
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on the issue:

https://github.com/apache/incubator-tephra/pull/37
  
@anew @gokulavasan I have addressed the comments, please take another look


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875318#comment-15875318
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102127364
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public void clear() {
+invalid.clear();
+invalidArray = NO_INVALID_TX;
+dirty = false;
+  }
+
+  public long[] toArray() {
+if (dirty) {
+  Collections.sort(invalid);
+  invalidArray = invalid.toLongArray();
+  dirty = false;
+}
+return invalidArray;
+  }
+
+  public List toList() {
+return Collections.unmodifiableList(invalid);
--- End diff --

The sorting behavior is consistent with the existing implementation. Also 
truncate and snapshot (the two places where toList() is used) are not invoked 
frequently. So I suggest that, we'll leave this behavior as-is for now. We can 
take a look at this again when working on the performance improvements.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>  

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875301#comment-15875301
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102125886
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
--- End diff --

Thanks, it makes the code more readable


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875295#comment-15875295
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user anew commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102124958
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public void clear() {
+invalid.clear();
+invalidArray = NO_INVALID_TX;
+dirty = false;
+  }
+
+  public long[] toArray() {
+if (dirty) {
+  Collections.sort(invalid);
+  invalidArray = invalid.toLongArray();
+  dirty = false;
+}
+return invalidArray;
+  }
+
+  public List toList() {
+return Collections.unmodifiableList(invalid);
--- End diff --

Or the caller can remove the assumption. Sorting takes n log n. Whereas the 
caller traverses the list once, which is n. Not sure it it worth sorting.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875286#comment-15875286
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user anew commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102124522
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
--- End diff --

ditto


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875288#comment-15875288
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user anew commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102124543
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+if (changed && !dirty) {
--- End diff --

ditto


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875285#comment-15875285
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user anew commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102124470
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
--- End diff --

ditto


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875262#comment-15875262
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102123021
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public void clear() {
+invalid.clear();
+invalidArray = NO_INVALID_TX;
+dirty = false;
+  }
+
+  public long[] toArray() {
+if (dirty) {
+  Collections.sort(invalid);
+  invalidArray = invalid.toLongArray();
+  dirty = false;
+}
+return invalidArray;
+  }
+
+  public List toList() {
+return Collections.unmodifiableList(invalid);
--- End diff --

Good catch. I'll update InvalidTxList to lazily sort both the list and the 
array.


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875258#comment-15875258
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

Github user gokulavasan commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/37#discussion_r102122535
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/manager/InvalidTxList.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.manager;
+
+import it.unimi.dsi.fastutil.longs.LongArrayList;
+import org.apache.tephra.TransactionManager;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.concurrent.NotThreadSafe;
+
+/**
+ * This is an internal class used by the {@link TransactionManager} to 
store invalid transaction ids.
+ * This class uses both a list and an array to keep track of the invalid 
ids. The list is the primary
+ * data structure for storing the invalid ids. The array is populated 
lazily on changes to the list.
+ * The array is used to avoid creating a new array every time method 
{@link #toArray()} is invoked.
+ *
+ * This class is not thread safe and relies on external synchronization. 
TransactionManager always
+ * accesses an instance of this class after synchronization.
+ */
+@NotThreadSafe
+public class InvalidTxList {
+  private static final long[] NO_INVALID_TX = { };
+
+  private final LongArrayList invalid = new LongArrayList();
+  private long[] invalidArray = NO_INVALID_TX;
+
+  private boolean dirty = false; // used to track changes to the invalid 
list
+
+  public int size() {
+return invalid.size();
+  }
+
+  public boolean isEmpty() {
+return invalid.isEmpty();
+  }
+
+  public boolean add(long id) {
+boolean changed = invalid.add(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean addAll(Collection ids) {
+boolean changed = invalid.addAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean contains(long id) {
+return invalid.contains(id);
+  }
+
+  public boolean remove(long id) {
+boolean changed = invalid.rem(id);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public boolean removeAll(Collection ids) {
+boolean changed = invalid.removeAll(ids);
+if (changed && !dirty) {
+  dirty = true;
+}
+return changed;
+  }
+
+  public void clear() {
+invalid.clear();
+invalidArray = NO_INVALID_TX;
+dirty = false;
+  }
+
+  public long[] toArray() {
+if (dirty) {
+  Collections.sort(invalid);
+  invalidArray = invalid.toLongArray();
+  dirty = false;
+}
+return invalidArray;
+  }
+
+  public List toList() {
+return Collections.unmodifiableList(invalid);
--- End diff --

This won't be sorted right? But the comment on line 1114 of 
TransactionManager seems to assume that this method return a sorted list?


> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the 

[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875250#comment-15875250
 ] 

ASF GitHub Bot commented on TEPHRA-223:
---

GitHub user poornachandra opened a pull request:

https://github.com/apache/incubator-tephra/pull/37

TEPHRA-223 Encapsulate the two data structures used for invalid 
transactions to avoid update issues

JIRA - https://issues.apache.org/jira/browse/TEPHRA-223

Approach:
Encapsulate the invalid list and the invalidArray data structures into a 
class InvalidTxList. InvalidTxList tracks the changes to invalid list, and then 
populates invalidArray lazily when the list changes.

TODO: Add tests for class InvalidTxList

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/poornachandra/incubator-tephra 
feature/invalid-persistence

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-tephra/pull/37.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #37


commit 41f293e6f23a307a7709aba0c3a625abb29f9d4d
Author: poorna 
Date:   2017-02-21T01:13:39Z

TEPHRA-223 Encapsulate the two data structures used for invalid 
transactions to avoid update issues

commit ff2996916307ee49c215aba9758b03ad7b318db9
Author: poorna 
Date:   2017-02-21T01:52:20Z

TEPHRA-223 Synchronize the access to invalid list




> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (TEPHRA-223) Transactions started after a snapshot restore can have incorrect invalid transaction list

2017-02-20 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875106#comment-15875106
 ] 

James Taylor commented on TEPHRA-223:
-

Agree, [~poornachandra]. I think we should spin up a new 0.11 RC with a fix.

> Transactions started after a snapshot restore can have incorrect invalid 
> transaction list
> -
>
> Key: TEPHRA-223
> URL: https://issues.apache.org/jira/browse/TEPHRA-223
> Project: Tephra
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.5.0, 0.6.5, 0.7.0, 0.8.0-incubating, 0.9.0-incubating, 
> 0.10.0-incubating
>Reporter: Poorna Chandra
>Assignee: Poorna Chandra
> Fix For: 0.12.0-incubating
>
>
> Transaction Manager uses two datastructures to manage the invalid list - a 
> {{List}} and an {{array}}. All updates to invalid list happen to the 
> {{List}}, and the list is then sorted and copied over to the {{array}}. This 
> is an optimization done so as to not sort the invalid list every time a new 
> transaction is created.
> However during a snapshot restore after the Transaction Manager starts up, 
> the invalid list gets correctly restored to the {{List}} but does not get 
> copied over to the {{array}}. This will lead to transactions started right 
> after a snapshot restore to have empty invalid list. This will make invalid 
> data to become visible to those transactions.
> Since the {{List}} still contains the right invalid list, any update to the 
> invalid list - like invalidating a transaction,  will restore the {{array}} 
> back to a good state.
> Also compactions will still remove invalid data as expected since new 
> snapshots are searialized using the {{List}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)