[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=328092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328092 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 14/Oct/19 19:59
Start Date: 14/Oct/19 19:59
Worklog Time Spent: 10m
Work Description: asfgit commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 328092)
Time Spent: 2h 20m (was: 2h 10m)

> Large message's copy may be interfered by other threads
> ---
>
> Key: ARTEMIS-2513
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2513
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.10.1
> Reporter: Howard Gao
> Assignee: Howard Gao
> Priority: Major
> Fix For: 2.11.0
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> LargeServerMessageImpl.copy(long) needs to open the underlying file in order
> to read and copy its bytes into the new message. However, another thread can
> come in and close the file in the middle of the copy, making the copy fail
> with a "channel is null" error.
> This happens when a large message is sent to a JMS topic (multicast address).
> While the message is delivered to multiple subscribers, one consumer finishes
> its delivery and then closes the underlying file, while another consumer
> rolls back the message, which is eventually moved to the DLQ (which calls the
> copy method above). So there is a chance of hitting this bug.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
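The race described in the issue can be illustrated outside the broker. The following is only a minimal sketch using plain java.nio, not the Artemis SequentialFile API: the class name PrivateChannelCopy and the helper copyWithPrivateChannel are hypothetical names chosen for illustration. It shows the general idea behind reading through a handle that only the copying code owns, so that closing a shared handle elsewhere cannot interrupt the copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PrivateChannelCopy {

    // Hypothetical helper (not Artemis code): reads the whole file through a
    // channel that only this call owns, so no other thread can close it mid-copy.
    static byte[] copyWithPrivateChannel(Path path) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.allocate(8 * 1024);
        try (FileChannel own = FileChannel.open(path, StandardOpenOption.READ)) {
            while (own.read(buffer) != -1) {
                buffer.flip();
                copy.write(buffer.array(), 0, buffer.limit());
                buffer.clear();
            }
        }
        return copy.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("large-message", ".bin");
        try {
            Files.write(path, new byte[20_000]);

            // A handle shared with other consumers; one of them closes it.
            FileChannel shared = FileChannel.open(path, StandardOpenOption.READ);
            shared.close();

            // The copy is unaffected because it never touches the shared handle.
            byte[] copy = copyWithPrivateChannel(path);
            System.out.println("copied " + copy.length + " bytes");
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

The trade-off is one extra open and close per copy, in exchange for not having to coordinate with whichever thread currently owns the shared handle.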
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=328090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328090 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 14/Oct/19 19:58
Start Date: 14/Oct/19 19:58
Worklog Time Spent: 10m
Work Description: clebertsuconic commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r334639588

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
@wy96f / @gaohoward actually, I will merge this right away, as whatever I do would clash here, and I will make sure I address how we open and close files.

Issue Time Tracking
---

Worklog Id: (was: 328090)
Time Spent: 2h 10m (was: 2h)
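The ARTEMIS-1220 comment in the hunk above is about a read buffer that is refilled on every loop iteration while an earlier chunk may still be in use elsewhere. As a rough illustration of that concern only (this is not the Artemis code path, and it simplifies the hunk by always copying rather than copying conditionally), the sketch below hands every consumer a fresh array. The names ChunkedCopy and readChunks are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedCopy {

    // Hypothetical helper: reads in fixed-size chunks and keeps every chunk.
    // Each chunk is a fresh array, so refilling the read buffer later cannot
    // change data that a consumer still holds.
    static List<byte[]> readChunks(InputStream in, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        byte[] buffer = new byte[chunkSize]; // reused for every read
        int bytesRead;
        while ((bytesRead = in.read(buffer)) > 0) {
            chunks.add(Arrays.copyOf(buffer, bytesRead)); // defensive copy
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[250];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        List<byte[]> chunks = readChunks(new ByteArrayInputStream(data), 100);
        for (byte[] chunk : chunks) {
            System.out.println(chunk.length + " bytes, first = " + chunk[0]);
        }
    }
}
```

Had the list stored the shared buffer itself instead of a copy, every entry would alias the same array and show only the last bytes read, which is the kind of interference the comment in the hunk warns about.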
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=328089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328089 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 14/Oct/19 19:57
Start Date: 14/Oct/19 19:57
Worklog Time Spent: 10m
Work Description: clebertsuconic commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r334639212

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
@wy96f I'm doing some work with large messages, and I want to double check that as well.

Issue Time Tracking
---

Worklog Id: (was: 328089)
Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=328088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328088 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 14/Oct/19 19:57
Start Date: 14/Oct/19 19:57
Worklog Time Spent: 10m
Work Description: clebertsuconic commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-541887630

@gaohoward can I keep this open for 1 or 2 days? I'm doing some work on large messages and I want to understand why we have that code to close or keep files open.

Issue Time Tracking
---

Worklog Id: (was: 328088)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=326153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326153 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 10/Oct/19 07:28
Start Date: 10/Oct/19 07:28
Worklog Time Spent: 10m
Work Description: gaohoward commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-540432702

@clebertsuconic Jenkins has 5 failures, which are not likely to be related; those failed tests pass on my local machine. So it's safe now.

Issue Time Tracking
---

Worklog Id: (was: 326153)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=326085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326085 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 10/Oct/19 02:55
Start Date: 10/Oct/19 02:55
Worklog Time Spent: 10m
Work Description: gaohoward commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r11392

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
tbh I don't know either. I just kept the old behavior.

Issue Time Tracking
---

Worklog Id: (was: 326085)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=326067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326067 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 10/Oct/19 02:28
Start Date: 10/Oct/19 02:28
Worklog Time Spent: 10m
Work Description: wy96f commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r06857

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
@clebertsuconic @gaohoward I don't understand why we only close the files when the old file was not originally open. In your case, where other consumers open the file and deliver, the file of the new large message might not be closed. This would result in a file leak, and maybe data corruption if the broker crashes. wdyt?

Issue Time Tracking
---

Worklog Id: (was: 326067)
Time Spent: 1h 20m (was: 1h 10m)
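The leak concern raised in the review comment above comes down to closing every handle the copy opens, on both the source and the target side, whether or not the copy succeeds. The sketch below shows that pattern with try-with-resources. It is an assumption-laden illustration, not the Artemis fix: Handle is a hypothetical stand-in for a file handle, and the counter exists only to make leaks observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CloseBothEnds {

    // Counts handles that are currently open, to make leaks visible.
    static final AtomicInteger OPEN_HANDLES = new AtomicInteger();

    // Hypothetical stand-in for a file handle; not the Artemis SequentialFile API.
    static final class Handle implements AutoCloseable {
        private boolean open = true;

        Handle() {
            OPEN_HANDLES.incrementAndGet();
        }

        @Override
        public void close() {
            if (open) {
                open = false;
                OPEN_HANDLES.decrementAndGet();
            }
        }
    }

    // Opens a source and a target handle and always closes both,
    // even when the copy fails part-way through.
    static void copy(boolean failMidway) {
        try (Handle source = new Handle(); Handle target = new Handle()) {
            if (failMidway) {
                throw new IllegalStateException("simulated read failure");
            }
        }
    }

    public static void main(String[] args) {
        copy(false);
        try {
            copy(true);
        } catch (IllegalStateException expected) {
            // The failure is expected; the handles must still be closed.
        }
        System.out.println("open handles after both copies: " + OPEN_HANDLES.get());
    }
}
```

Closing unconditionally is only safe when the handles are private to the copy; a handle shared with other consumers would need a different ownership rule.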
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325794&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325794 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 09/Oct/19 16:11
Start Date: 09/Oct/19 16:11
Worklog Time Spent: 10m
Work Description: gaohoward commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-540072662

I'll run Jenkins again.

Issue Time Tracking
---

Worklog Id: (was: 325794)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325768&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325768 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 09/Oct/19 15:50
Start Date: 09/Oct/19 15:50
Worklog Time Spent: 10m
Work Description: gaohoward commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-540063855

@clebertsuconic Yes, I'm fixing it. HornetQ has the same issue; I'll fix that too.

Issue Time Tracking
---

Worklog Id: (was: 325768)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325767 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 09/Oct/19 15:50
Start Date: 09/Oct/19 15:50
Worklog Time Spent: 10m
Work Description: gaohoward commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r333093509

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
Yes, it should be closed as in the original code. Leaving it open will cause a file leak.

Issue Time Tracking
---

Worklog Id: (was: 325767)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325754 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 09/Oct/19 15:20
Start Date: 09/Oct/19 15:20
Worklog Time Spent: 10m
Work Description: clebertsuconic commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-540050231

There are files leaking in your test.

Issue Time Tracking
---

Worklog Id: (was: 325754)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325693&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325693 ]

ASF GitHub Bot logged work on ARTEMIS-2513:
---

Author: ASF GitHub Bot
Created on: 09/Oct/19 13:29
Start Date: 09/Oct/19 13:29
Worklog Time Spent: 10m
Work Description: wy96f commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#discussion_r333012845

## File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/LargeServerMessageImpl.java ##
@@ -368,52 +368,48 @@ public Message copy(final long newID) {
       try {
          LargeServerMessage newMessage = storageManager.createLargeMessage(newID, this);
-         boolean originallyOpen = file != null && file.isOpen();
+         //clone a SequentialFile to avoid concurrent access
+         ensureFileExists(false);
+         SequentialFile cloneFile = file.cloneFile();
-         validateFile();
-
-         byte[] bufferBytes = new byte[100 * 1024];
-
-         ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
+         try {
+            byte[] bufferBytes = new byte[100 * 1024];
-         long oldPosition = file.position();
+            ByteBuffer buffer = ByteBuffer.wrap(bufferBytes);
-         if (!file.isOpen()) {
-            file.open();
-         }
-         file.position(0);
-
-         for (;;) {
-            // The buffer is reused...
-            // We need to make sure we clear the limits and the buffer before reusing it
-            buffer.clear();
-            int bytesRead = file.read(buffer);
-
-            byte[] bufferToWrite;
-            if (bytesRead <= 0) {
-               break;
-            } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
-               // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
-               // otherwise there could be another thread still using the buffer on a
-               // replication.
-               bufferToWrite = bufferBytes;
-            } else {
-               bufferToWrite = new byte[bytesRead];
-               System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+            if (!cloneFile.isOpen()) {
+               cloneFile.open();
             }
-            newMessage.addBytes(bufferToWrite);
-
-            if (bytesRead < bufferBytes.length) {
-               break;
+            cloneFile.position(0);
+
+            for (;;) {
+               // The buffer is reused...
+               // We need to make sure we clear the limits and the buffer before reusing it
+               buffer.clear();
+               int bytesRead = cloneFile.read(buffer);
+
+               byte[] bufferToWrite;
+               if (bytesRead <= 0) {
+                  break;
+               } else if (bytesRead == bufferBytes.length && !this.storageManager.isReplicated()) {
+                  // ARTEMIS-1220: We cannot reuse the same buffer if it's replicated
+                  // otherwise there could be another thread still using the buffer on a
+                  // replication.
+                  bufferToWrite = bufferBytes;
+               } else {
+                  bufferToWrite = new byte[bytesRead];
+                  System.arraycopy(bufferBytes, 0, bufferToWrite, 0, bytesRead);
+               }
+
+               newMessage.addBytes(bufferToWrite);
+
+               if (bytesRead < bufferBytes.length) {
+                  break;
+               }
             }
-         }
-
-         file.position(oldPosition);
-
-         if (!originallyOpen) {
-            file.close(false);
-            newMessage.getFile().close();
+         } finally {
+            cloneFile.close();
          }

Review comment:
Do we need to close the underlying file of the new large message after the copy?

Issue Time Tracking
---

Worklog Id: (was: 325693)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325683=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325683 ]

ASF GitHub Bot logged work on ARTEMIS-2513:

Author: ASF GitHub Bot
Created on: 09/Oct/19 13:18
Start Date: 09/Oct/19 13:18
Worklog Time Spent: 10m
Work Description: gaohoward commented on issue #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859#issuecomment-539996782

Please hold this, it's getting test failures in Jenkins.

Issue Time Tracking
Worklog Id: (was: 325683)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (ARTEMIS-2513) Large message's copy may be interfered by other threads
[ https://issues.apache.org/jira/browse/ARTEMIS-2513?focusedWorklogId=325444=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325444 ]

ASF GitHub Bot logged work on ARTEMIS-2513:

Author: ASF GitHub Bot
Created on: 09/Oct/19 02:24
Start Date: 09/Oct/19 02:24
Worklog Time Spent: 10m
Work Description: gaohoward commented on pull request #2859: ARTEMIS-2513 Large message's copy may be interfered by other threads
URL: https://github.com/apache/activemq-artemis/pull/2859

LargeMessageImpl.copy(long) needs to open the underlying file in order to read and copy bytes into the newly copied message. However, another thread can come in and close the file in the middle of the copy, making the copy fail with a "channel is null" error.

This happens when a large message is sent to a JMS topic (multicast address). While the message is being delivered to multiple subscribers, one consumer finishes its delivery and then closes the underlying file. Meanwhile another consumer rolls back the message and eventually moves it to the DLQ (which calls the copy method above). So there is a chance of hitting this bug.

Issue Time Tracking
Worklog Id: (was: 325444)
Remaining Estimate: 0h
Time Spent: 10m
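
The idea behind the fix discussed in this thread (read through a handle that only the copying thread owns, instead of a shared one that another thread may close) can be sketched outside of Artemis. The following is a minimal illustration using plain java.nio, not the Artemis code: the class name IndependentHandleCopy and its copy method are hypothetical, and the broker's SequentialFile API is deliberately not used here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of ActiveMQ Artemis.
public class IndependentHandleCopy {

   /**
    * Copies source to target through channels opened (and closed) by this
    * call only. Other threads that open and close their own handles on the
    * same file cannot close the channel this method reads from.
    */
   public static long copy(Path source, Path target) throws IOException {
      ByteBuffer buffer = ByteBuffer.allocate(100 * 1024);
      long total = 0;
      try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
           FileChannel out = FileChannel.open(target,
                 StandardOpenOption.CREATE,
                 StandardOpenOption.WRITE,
                 StandardOpenOption.TRUNCATE_EXISTING)) {
         for (;;) {
            // The buffer is reused, so reset position and limit first.
            buffer.clear();
            int bytesRead = in.read(buffer);
            if (bytesRead <= 0) {
               break; // end of file
            }
            buffer.flip();
            while (buffer.hasRemaining()) {
               total += out.write(buffer);
            }
         }
      }
      return total;
   }

   public static void main(String[] args) throws IOException {
      Path source = Files.createTempFile("large-message", ".msg");
      Path target = Files.createTempFile("large-message-copy", ".msg");
      try {
         Files.write(source, new byte[250 * 1024]);

         // Stand-in for another consumer that opens the same file for
         // delivery and closes its handle when done.
         FileChannel otherConsumer = FileChannel.open(source, StandardOpenOption.READ);
         otherConsumer.close();

         long copied = copy(source, target);
         System.out.println("copied " + copied + " bytes");
      } finally {
         Files.deleteIfExists(source);
         Files.deleteIfExists(target);
      }
   }
}
```

The trade-off is one extra open and close per copy in exchange for not having to coordinate the open/closed state of a shared handle across threads.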