[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Phil Steitz (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732273#comment-17732273
 ] 

Phil Steitz commented on IO-802:


[~ggregory] Makes sense.  Thanks.

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732268#comment-17732268
 ] 

Gary D. Gregory commented on IO-802:


The issue the zero fill addresses is avoiding an input stream implementation 
having access to the data that a previous call wrote. This might be overly 
paranoid but it certainly feels safer when using a shared cache. Granted it's 
shared only within a class loader but it might be possible for some container 
(servlet container or something else) that co-locates applications to 
inadvertently provide an attack vector. It does seem far fetched but I thought 
I'd mention it anyway.

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Phil Steitz (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732238#comment-17732238
 ] 

Phil Steitz commented on IO-802:


I agree that it would be best to just add back the protection, for the reasons 
above.  The jdk developers at least in the InflaterInputStream case seem to 
think concurrent writes to a read buffer should not happen, so I would 
recommend eliminating this risk.   I would also evaluate the need for zero-ing 
for the write only one.  Is it needed?  

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732185#comment-17732185
 ] 

Tilman Hausherr commented on IO-802:


I agree. Simple stuff like that {{IOUtils.skipFully(input, toSkip)}} should 
"just work". Also there may be other code out there and people won't know that 
they need to use this very special new method if they update from 2.11.0 (and 
they will many, if dependabot some day fixes the bug that it didn't detect that 
there is a new version of commons-io).

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Tim Allison (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732161#comment-17732161
 ] 

Tim Allison commented on IO-802:


My concern with that approach is that users would need to know when they have 
to take matters into their own hands.  {{InflaterInputStream}} is weird and 
does not document what it is doing, but there could be other InputStreams that 
behave similarly.  If we didn't have multithreaded tests on Tika, we would 
never have known that {{InflaterInputStream}} is, um, surprising.

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732157#comment-17732157
 ] 

Hudson commented on IO-802:
---

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1113 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1113/])
TIKA-4065 -- upgrade commons-io and add workaround for IO-802 until that is 
fixed. (tallison: 
[https://github.com/apache/tika/commit/8793fad6464facfa5a29dae368a85ede106d2230])
* (edit) tika-parent/pom.xml
* (edit) 
tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-miscoffice-module/src/main/java/org/apache/tika/parser/hwp/HwpStreamReader.java


> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IO-802) Restore threadlocal for skipfully() byte buffer

2023-06-13 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/IO-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17732129#comment-17732129
 ] 

Gary D. Gregory commented on IO-802:


This feels to me like such an unusual case that we should offer a new API where 
the call site can provide its own buffer:

{{IOUtils.skipFully(InputStream, long, byte[])}}

> Restore threadlocal for skipfully() byte buffer
> ---
>
> Key: IO-802
> URL: https://issues.apache.org/jira/browse/IO-802
> Project: Commons IO
>  Issue Type: Bug
>Reporter: Tim Allison
>Priority: Major
>
> Over on TIKA-4065, we found that trying to upgrade to commons-io 2.12.0 or 
> 2.13.0 caused one of our unit tests to fail.  We found that dropping 
> {{threadlocal}} on the buffer used in IOUtils.skipFully() in conjunction with 
> Java's InflaterInputStream was the cause of the problem.
> Our unit test shows that running skipFully() on a stream and then reading 
> gets different results on the same underlying stream when running 
> multithreaded.  This is really bad.  It appears to be confined to 
> InflaterInputStream...so not a very common case.
> On the [commons-io's user 
> list|https://lists.apache.org/thread/rxfyxqochnj7bw75nr2v7hf5qtkogx7d] 
> [~psteitz] observed that Java's InflaterInputStream expects read access to 
> the byte array passed in...so having multiple threads writing to the same 
> static (not-thread local) byte array is dangerous.  The behavior of Java's 
> InflaterInputStream is surprising and not documented.
> I have a demonstration of the problem here: 
> https://github.com/tballison/commons-io/blob/TIKA-4065/src/test/java/org/apache/commons/io/IOUtilsMultithreadedTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)