[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983584#comment-15983584
 ] 

ASF GitHub Bot commented on GEODE-2398:
---

Github user asfgit closed the pull request at:

https://github.com/apache/geode/pull/477


> Sporadic Oplog corruption due to channel.write failure
> --
>
> Key: GEODE-2398
> URL: https://issues.apache.org/jira/browse/GEODE-2398
> Project: Geode
> Issue Type: Bug
> Components: persistence
> Reporter: Kenneth Howe
> Assignee: Anilkumar Gingade
> Fix For: 1.2.0
>
>
> There have been some occurrences of Oplog corruption during testing that have
> been traced to failures in writing oplog entries to the .crf file. In the
> failing case, Oplog.flush attempts to write a ByteBuffer to the file channel.
> The call to channel.write(bb) returns 0 bytes written, but the source
> ByteBuffer position is moved to the ByteBuffer limit.
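
The symptom in the description can be illustrated with a small consistency check. This is only a sketch, not Geode code: SilentlyFailingChannel and writeIsConsistent are hypothetical names, and the stub channel merely imitates the reported behaviour (buffer drained to its limit, 0 bytes reported).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class SilentWriteFailureDemo {

  /** Hypothetical stand-in for the faulty channel: drains the buffer but reports 0 bytes. */
  static final class SilentlyFailingChannel implements WritableByteChannel {
    @Override
    public int write(ByteBuffer src) {
      src.position(src.limit()); // the buffer now looks fully written...
      return 0;                  // ...but the channel claims nothing was written
    }

    @Override
    public boolean isOpen() {
      return true;
    }

    @Override
    public void close() {}
  }

  /** Returns true when the count returned by write() matches how far the buffer advanced. */
  static boolean writeIsConsistent(WritableByteChannel channel, ByteBuffer bb)
      throws IOException {
    final int startPos = bb.position();
    final int written = channel.write(bb);
    return written == bb.position() - startPos;
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer bb = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
    // The stub advances the position by 4 while returning 0, so this prints "false".
    System.out.println(writeIsConsistent(new SilentlyFailingChannel(), bb));
  }
}
```

A caller that only looks at bb.hasRemaining() after such a write would treat the data as flushed, which is how entries could go missing from the file.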



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-25 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983581#comment-15983581
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit a3434e29976a7d54957878a074806d016eb59234 in geode's branch 
refs/heads/develop from [~lgallinat]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=a3434e2 ]

GEODE-2398: fix oplog corruption in overflow oplogs

* ported changes from original fix in Oplog.java to OverflowOplog.java

This closes #477




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983360#comment-15983360
 ] 

ASF GitHub Bot commented on GEODE-2398:
---

Github user lgallinat commented on the issue:

https://github.com/apache/geode/pull/477
  
Anil, I have made your suggested change and pushed it to the feature branch.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-25 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983332#comment-15983332
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 5697f8c6f6d7e3a0272fc86be6d2c04ba87d9013 in geode's branch 
refs/heads/feature/GEODE-2398overflow from [~lgallinat]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=5697f8c ]

GEODE-2398: fix oplog corruption in overflow oplogs

* ported changes from original fix in Oplog.java to OverflowOplog.java




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982194#comment-15982194
 ] 

ASF GitHub Bot commented on GEODE-2398:
---

Github user agingade commented on a diff in the pull request:

https://github.com/apache/geode/pull/477#discussion_r113087379
  
--- Diff: geode-core/src/main/java/org/apache/geode/internal/cache/OverflowOplog.java ---
@@ -724,8 +727,31 @@ public final void flush() throws IOException {
 if (bb != null && bb.position() != 0) {
   bb.flip();
   int flushed = 0;
+  int numChannelRetries = 0;
   do {
-flushed += olf.channel.write(bb);
+int channelBytesWritten = 0;
+final int bbStartPos = bb.position();
+final long channelStartPos = olf.channel.position();
+// differentiate between bytes written on this channel.write() iteration and the
+// total number of bytes written to the channel on this call
+channelBytesWritten += olf.channel.write(bb);
--- End diff --

Instead of "+=", we could just assign the value. It doesn't really make any
difference; it's just that when you read this line, you don't have to know its
previous value.
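
The suggestion can be sketched outside of Geode as follows. This is an illustrative example only: PerIterationCount and drain are hypothetical names, and the loop has no retry limit, so it assumes a blocking channel that always makes progress.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class PerIterationCount {

  /** Drains bb into the channel and returns the total number of bytes written. */
  static int drain(WritableByteChannel channel, ByteBuffer bb) throws IOException {
    int flushed = 0; // running total across all iterations
    while (bb.hasRemaining()) {
      // Declared fresh on every pass, so a plain assignment is equivalent to
      // initializing to 0 and then using "+=".
      final int channelBytesWritten = channel.write(bb);
      flushed += channelBytesWritten;
    }
    return flushed;
  }
}
```

Only the running total needs "+="; the per-iteration value can be final, which makes its scope obvious to the reader.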




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982095#comment-15982095
 ] 

ASF GitHub Bot commented on GEODE-2398:
---

GitHub user lgallinat opened a pull request:

https://github.com/apache/geode/pull/477

GEODE-2398: fix oplog corruption in overflow oplogs

* ported changes from original fix in Oplog.java to 
OverflowOplog.java 

@agingade @dschneider-pivotal @pivotal-eshu @pdxrunner 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/geode feature/GEODE-2398overflow

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/geode/pull/477.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #477








[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-04-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982042#comment-15982042
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 7546b57dce91c079714a6fc0ef37785a4c2799cc in geode's branch 
refs/heads/feature/GEODE-2398overflow from [~lgallinat]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=7546b57 ]

GEODE-2398: fix oplog corruption in overflow oplogs

* ported changes from original fix in Oplog.java to OverflowOplog.java




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868350#comment-15868350
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit fb14e9aab263654ed0176dcc3c9738be1b208a82 in geode's branch 
refs/heads/feature/GEODE-2449 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=fb14e9a ]

GEODE-2398: Updates from review

https://reviews.apache.org/r/56506/


> Sporadic Oplog corruption due to channel.write failure
> --
>
> Key: GEODE-2398
> URL: https://issues.apache.org/jira/browse/GEODE-2398
> Project: Geode
> Issue Type: Bug
> Components: persistence
> Reporter: Kenneth Howe
> Assignee: Kenneth Howe
> Fix For: 1.2.0
>
>
> There have been some occurrences of Oplog corruption during testing that have
> been traced to failures in writing oplog entries to the .crf file. In the
> failing case, Oplog.flush attempts to write a ByteBuffer to the file channel.
> The call to channel.write(bb) returns 0 bytes written, but the source
> ByteBuffer position is moved to the ByteBuffer limit.





[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868348#comment-15868348
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 9b0f16570aad4abc82b71d0d16167a9774449d41 in geode's branch 
refs/heads/feature/GEODE-2449 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=9b0f165 ]

GEODE-2398: Retry oplog channel.write on silent failures

Implemented limited retries in the two forms of Oplog.flush() that call
channel.write(). If write() reports fewer bytes written than the change in
the ByteBuffer positions, the buffer positions are reset and the write is
retried a limited number of times. An IOException is thrown if the write
doesn't succeed after a few retries (the maximum number of retries is
defined by a static).

Added new unit tests.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866600#comment-15866600
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 9b0f16570aad4abc82b71d0d16167a9774449d41 in geode's branch 
refs/heads/feature/GEODE-2402 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=9b0f165 ]

GEODE-2398: Retry oplog channel.write on silent failures

Implemented limited retries in the two forms of Oplog.flush() that call
channel.write(). If write() reports fewer bytes written than the change in
the ByteBuffer positions, the buffer positions are reset and the write is
retried a limited number of times. An IOException is thrown if the write
doesn't succeed after a few retries (the maximum number of retries is
defined by a static).

Added new unit tests.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866601#comment-15866601
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit fb14e9aab263654ed0176dcc3c9738be1b208a82 in geode's branch 
refs/heads/feature/GEODE-2402 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=fb14e9a ]

GEODE-2398: Updates from review

https://reviews.apache.org/r/56506/




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866418#comment-15866418
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 9b0f16570aad4abc82b71d0d16167a9774449d41 in geode's branch 
refs/heads/feature/GEODE-2267 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=9b0f165 ]

GEODE-2398: Retry oplog channel.write on silent failures

Implemented limited retries in the two forms of Oplog.flush() that call
channel.write(). If write() reports fewer bytes written than the change in
the ByteBuffer positions, the buffer positions are reset and the write is
retried a limited number of times. An IOException is thrown if the write
doesn't succeed after a few retries (the maximum number of retries is
defined by a static).

Added new unit tests.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866419#comment-15866419
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit fb14e9aab263654ed0176dcc3c9738be1b208a82 in geode's branch 
refs/heads/feature/GEODE-2267 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=fb14e9a ]

GEODE-2398: Updates from review

https://reviews.apache.org/r/56506/




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865063#comment-15865063
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 9b0f16570aad4abc82b71d0d16167a9774449d41 in geode's branch 
refs/heads/feature/GEODE-2474 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=9b0f165 ]

GEODE-2398: Retry oplog channel.write on silent failures

Implemented limited retries in the two forms of Oplog.flush() that call
channel.write(). If write() reports fewer bytes written than the change in
the ByteBuffer positions, the buffer positions are reset and the write is
retried a limited number of times. An IOException is thrown if the write
doesn't succeed after a few retries (the maximum number of retries is
defined by a static).

Added new unit tests.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865064#comment-15865064
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit fb14e9aab263654ed0176dcc3c9738be1b208a82 in geode's branch 
refs/heads/feature/GEODE-2474 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=fb14e9a ]

GEODE-2398: Updates from review

https://reviews.apache.org/r/56506/




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863815#comment-15863815
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit aacfa0685d6e8f1043cb485bbd7182a71e444043 in geode's branch 
refs/heads/feature/GEODE-2398 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=aacfa06 ]

GEODE-2398: Updates from review

https://reviews.apache.org/r/56506/


> Sporadic Oplog corruption due to channel.write failure
> --
>
> Key: GEODE-2398
> URL: https://issues.apache.org/jira/browse/GEODE-2398
> Project: Geode
> Issue Type: Bug
> Components: persistence
> Reporter: Kenneth Howe
> Assignee: Kenneth Howe
>
> There have been some occurrences of Oplog corruption during testing that have
> been traced to failures in writing oplog entries to the .crf file. In the
> failing case, Oplog.flush attempts to write a ByteBuffer to the file channel.
> The call to channel.write(bb) returns 0 bytes written, but the source
> ByteBuffer position is moved to the ByteBuffer limit.





[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859927#comment-15859927
 ] 

ASF subversion and git services commented on GEODE-2398:


Commit 451c5b497662485bfb94b7f7afaacfd2cd82d043 in geode's branch 
refs/heads/feature/GEODE-2398 from [~khowe]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=451c5b4 ]

GEODE-2398: Retry oplog channel.write on silent failures

Implemented limited retries in the two forms of Oplog.flush() that call
channel.write(). If write() reports fewer bytes written than the change in
the ByteBuffer positions, the buffer positions are reset and the write is
retried a limited number of times. An IOException is thrown if the write
doesn't succeed after a few retries (the maximum number of retries is
defined by a static).

Added new unit tests.




[jira] [Commented] (GEODE-2398) Sporadic Oplog corruption due to channel.write failure

2017-02-09 Thread Kenneth Howe (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859759#comment-15859759
 ] 

Kenneth Howe commented on GEODE-2398:
-

This problem occurred while writing to the channel from within the method
Oplog.flush(OplogFile olf, boolean doSync). A channel write is also
executed from within Oplog.flush(OplogFile olf, ByteBuffer b1, ByteBuffer b2).
The second form of flush calls channel.write(ByteBuffer[] bbArray) instead of
channel.write(ByteBuffer bb) as in the first form. Since the write has been
seen to fail in the first form, there is presumably a remote chance of a similar
failure in the second form.
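
For the array form, the same kind of check has to sum over all buffers, because the gathering write returns a single long. The following is a minimal sketch under stated assumptions, not the Geode implementation: GatheringWriteCheck, totalRemaining and gatheringWriteIsConsistent are hypothetical names, and the demo writes to a temporary file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.GatheringByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class GatheringWriteCheck {

  /** Sum of the remaining bytes across all buffers. */
  static long totalRemaining(ByteBuffer[] buffers) {
    long total = 0;
    for (ByteBuffer b : buffers) {
      total += b.remaining();
    }
    return total;
  }

  /** One gathering write; true when the returned count equals the bytes the buffers gave up. */
  static boolean gatheringWriteIsConsistent(GatheringByteChannel channel, ByteBuffer[] buffers)
      throws IOException {
    final long before = totalRemaining(buffers);
    final long written = channel.write(buffers);
    // Limits do not change during a write, so the drop in remaining() is the position change.
    final long consumed = before - totalRemaining(buffers);
    return written == consumed;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("gather", ".bin");
    try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
      ByteBuffer[] bufs = {
        ByteBuffer.wrap(new byte[] {1, 2}), ByteBuffer.wrap(new byte[] {3, 4, 5})
      };
      System.out.println(gatheringWriteIsConsistent(ch, bufs));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

A partial write still counts as consistent here, since the buffers only advance by the number of bytes reported.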

The fix for this problem is to wrap the channel.write calls in a retry loop
that retries whenever the number of bytes written returned by write() is
inconsistent with the change in ByteBuffer positions. The number of retries is
limited to a small number so that a hard failure cannot cause a thread to hang.
An IOException is thrown if the retry limit is exceeded.
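
A bounded retry loop of this kind might look roughly like the sketch below. It is not the code that was committed: RetryingFlush, FlakyChannel and MAX_CHANNEL_RETRIES (with the value 5) are hypothetical, the sketch assumes the count returned by write() is the trustworthy one and rewinds the buffer to just past those bytes, and it assumes a blocking channel. It does not reposition the channel itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

final class RetryingFlush {
  /** Hypothetical retry limit; the value is chosen only for illustration. */
  static final int MAX_CHANNEL_RETRIES = 5;

  /** Writes all of bb, retrying when write() reports fewer bytes than the buffer advanced. */
  static int flush(WritableByteChannel channel, ByteBuffer bb) throws IOException {
    int flushed = 0;
    int retries = 0;
    while (bb.hasRemaining()) {
      final int startPos = bb.position();
      final int written = channel.write(bb);
      final int consumed = bb.position() - startPos;
      if (written < consumed) {
        if (++retries > MAX_CHANNEL_RETRIES) {
          throw new IOException(
              "Failed to write buffer after " + MAX_CHANNEL_RETRIES + " retries");
        }
        // Assumption: trust the returned count and rewind to just past those bytes.
        bb.position(startPos + written);
      }
      flushed += written;
    }
    return flushed;
  }

  /** Hypothetical stub: fails silently a fixed number of times, then delegates. */
  static final class FlakyChannel implements WritableByteChannel {
    private final WritableByteChannel delegate;
    private int failuresLeft;

    FlakyChannel(WritableByteChannel delegate, int failures) {
      this.delegate = delegate;
      this.failuresLeft = failures;
    }

    @Override
    public int write(ByteBuffer src) throws IOException {
      if (failuresLeft > 0) {
        failuresLeft--;
        src.position(src.limit()); // drain the buffer but report nothing written
        return 0;
      }
      return delegate.write(src);
    }

    @Override
    public boolean isOpen() {
      return delegate.isOpen();
    }

    @Override
    public void close() throws IOException {
      delegate.close();
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    WritableByteChannel flaky = new FlakyChannel(Channels.newChannel(sink), 2);
    int flushed = flush(flaky, ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6}));
    // Two silent failures are retried, then the write succeeds:
    // prints "6 bytes flushed, 6 bytes in sink".
    System.out.println(flushed + " bytes flushed, " + sink.size() + " bytes in sink");
  }
}
```

Capping the retries means a channel that never recovers surfaces as an IOException instead of a thread spinning forever.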
