[Bug 62779] New: Paragraph text search results start with an error marker.

2018-09-29 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=62779

Bug ID: 62779
   Summary: Paragraph text search results start with an error
marker.
   Product: POI
   Version: 4.0.0-FINAL
  Hardware: PC
Status: NEW
  Severity: normal
  Priority: P2
 Component: XWPF
  Assignee: dev@poi.apache.org
  Reporter: 1042126...@qq.com
  Target Milestone: ---

Created attachment 36177
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=36177&action=edit
Fix the sample

Fix the incorrect start position reported in paragraph text search results.

##src/ooxml/java/org/apache/poi/xwpf/usermodel/XWPFParagraph.java
In the searchText method, each iteration of the loop over rArray resets
beginTextPos and beginCharPos. As a result, the returned TextSegment contains
an invalid start position.

For example:
Search for ${code} in the runs ["code:${", "code", "}"].
The correct result should be

### start
startRun 0
startText 0
startChar 5
### end
endRun 2
endText 0
endChar 0
But the actual implementation only retains the startRun state; startText and
startChar are reset on each loop iteration, so the start is reported as:

startRun 0
startText 0
startChar 0
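
The expected positions above can be reproduced with a minimal sketch. It assumes each run holds a single text element (so the text index is always 0) and uses plain strings rather than XWPFRun objects. The class and method names are hypothetical, and this is not POI's searchText implementation: it simply concatenates the runs, finds the match, and maps the offsets back to (run, char) pairs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of POI: locates a needle that may span several runs.
class RunSearchSketch {

    /** Returns {startRun, startChar, endRun, endChar}, or null if there is no match. */
    static int[] search(List<String> runs, String needle) {
        if (needle.isEmpty()) {
            return null;
        }
        // Search the concatenated text so a match may cross run boundaries.
        int start = String.join("", runs).indexOf(needle);
        if (start < 0) {
            return null;
        }
        int[] s = locate(runs, start);
        int[] e = locate(runs, start + needle.length() - 1);
        return new int[] {s[0], s[1], e[0], e[1]};
    }

    /** Maps an offset in the concatenated text back to {runIndex, charIndex}. */
    private static int[] locate(List<String> runs, int offset) {
        int run = 0;
        while (offset >= runs.get(run).length()) {
            offset -= runs.get(run).length();
            run++;
        }
        return new int[] {run, offset};
    }

    public static void main(String[] args) {
        int[] m = search(Arrays.asList("code:${", "code", "}"), "${code}");
        System.out.println(Arrays.toString(m)); // prints [0, 5, 2, 0]
    }
}
```

The start position is computed once from the match offset and is never touched again while later runs are walked, which is the behaviour the report asks for.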

-- 
You are receiving this mail because:
You are the assignee for the bug.
-
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org



Re: POI 4.0.0 issues with new commons-compress library "InputStream of class [..] is not implementing InputStreamStatistics"

2018-09-29 Thread Jörn Franke
Don't worry, I guess it was too late in the evening. I simply shade the
commons-compress dependency and everything seems to work (and I can still
keep POI's integrated security mechanisms). Thanks, by the way, for 4.0.0.



Re: POI 4.0.0 issues with new commons-compress library "InputStream of class [..] is not implementing InputStreamStatistics"

2018-09-29 Thread Nick Burch

On Sat, 29 Sep 2018, Jörn Franke wrote:
> as part of the HadoopOffice library (
> https://github.com/zuinnote/hadoopoffice/wiki) we provide the
> functionality to read office documents, such as MS Excel, on Big Data
> platforms, such as Hadoop/Hive/Spark/Flink.


We should probably list that on the website! Do you have a few-paragraph
blurb we can use?



> I want to release a new version supporting POI 4.0.0, but I have one
> remaining blocking issue: The Big Data platforms use an old version of
> commons-compress (between 1.4.x and 1.9.x). This means I am always running
> into the exception in ZipArchiveThresholdInputStream "InputStream of class
> [..] is not implementing InputStreamStatistics" (
> https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream.java?view=markup=1832789
> ).


We need that for security reasons - newer Java versions won't let us 
protect against zip bomb attacks as they inconveniently hide the expansion 
stats, so we had to switch to commons to guard against it.
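
A rough sketch of the kind of guard described here, under stated assumptions: `StreamStats` is a local stand-in for the statistics interface named in the thread (it is not the commons-compress type), `RatioGuardSketch` is a hypothetical class, and the exception types and the 0.01 threshold in the usage note are illustrative. It is not POI's ZipArchiveThresholdInputStream.

```java
import java.io.IOException;
import java.io.InputStream;

// Local stand-in (assumption) for the statistics interface discussed in the thread.
interface StreamStats {
    long getCompressedCount();
    long getUncompressedCount();
}

// Hypothetical guard, not POI's ZipArchiveThresholdInputStream.
class RatioGuardSketch {

    /** Rejects streams that cannot report their expansion statistics. */
    static StreamStats requireStats(InputStream in) {
        if (!(in instanceof StreamStats)) {
            throw new IllegalArgumentException(
                "InputStream of class " + in.getClass() + " is not implementing StreamStats");
        }
        return (StreamStats) in;
    }

    /** Fails when the data expands more than the minimum ratio allows. */
    static void checkRatio(StreamStats stats, double minInflateRatio) throws IOException {
        long uncompressed = stats.getUncompressedCount();
        if (uncompressed == 0) {
            return; // nothing has been read yet, so there is no ratio to check
        }
        double ratio = (double) stats.getCompressedCount() / uncompressed;
        if (ratio < minInflateRatio) {
            throw new IOException("Possible zip bomb: compressed/uncompressed ratio " + ratio);
        }
    }
}
```

With a threshold of 0.01, a stream that has read 1 compressed byte and produced 1000 uncompressed bytes would be rejected, while a plain java.io stream would already fail in requireStats because it exposes no counters at all.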



> Unfortunately, updating these platforms to the latest commons-compress is
> very intrusive and for many organizations not possible.


Wave some CVEs at them and see if you can tempt an upgrade?

If not, you'd probably need to work with the commons folks to backport the 
zip stats stuff to your old version, so you can keep the security stuff we 
need? dev@commons is moderately quiet and fairly friendly :)


Nick


POI 4.0.0 issues with new commons-compress library "InputStream of class [..] is not implementing InputStreamStatistics"

2018-09-29 Thread Jörn Franke
Dear all,

as part of the HadoopOffice library (
https://github.com/zuinnote/hadoopoffice/wiki) we provide the functionality
to read office documents, such as MS Excel, on Big Data platforms, such as
Hadoop/Hive/Spark/Flink.

I want to release a new version supporting POI 4.0.0, but I have one
remaining blocking issue: The Big Data platforms use an old version of
commons-compress (between 1.4.x and 1.9.x). This means I am always running
into the exception in ZipArchiveThresholdInputStream "InputStream of class
[..] is not implementing InputStreamStatistics" (
https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream.java?view=markup=1832789
).

Unfortunately, updating these platforms to the latest commons-compress is
very intrusive and, for many organizations, not possible. I now need to find
a workaround for this. Alternative classpath settings do not work very
well and create another mess.

Do you have any idea how I can deal with this check? Can I somehow inject
InputStreamStatistics into my InputStream? Or can I somehow inject my
own ZipArchiveInputStream?
Alternatively, could Apache POI, instead of using ZipArchiveInputStream,
create another class POIZipArchiveInputStream and let this custom class
extend ArchiveInputStream and implement InputStreamStatistics? This would
remove all my classpath issues with the Big Data platforms.


Thank you.

Best regards


[Bug 62778] url-encoded in internal links (Target/TargetURI) for xlsx excel cannot to recognize

2018-09-29 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=62778

--- Comment #1 from PJ Fanning  ---
Can you provide a junit test case?




[Bug 62778] New: url-encoded in internal links (Target/TargetURI) for xlsx excel cannot to recognize

2018-09-29 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=62778

Bug ID: 62778
   Summary: url-encoded in internal links (Target/TargetURI) for
xlsx excel cannot to recognize
   Product: POI
   Version: 4.0.x-dev
  Hardware: PC
OS: Linux
Status: NEW
  Severity: normal
  Priority: P2
 Component: XSSF
  Assignee: dev@poi.apache.org
  Reporter: an...@aspinformatica.com.br
  Target Milestone: ---

Created attachment 36176
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=36176&action=edit
Decode the Target tag before zip part marshallers

The problem occurs when we write/save an Excel xlsx file whose sheet names
contain white space.
For example:
Performance Resume => Performance%20Resume
Performance Resume 2 => Performance%20Resume%202

Excel cannot recognize these links in shapes, cells, buttons, ...

To resolve this, I debugged the code in Eclipse for about a week and
found the source of the problem.

I added just one line to decode the Target string.

The diff is in the attachment.
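
As an illustration of the decoding step only (this is not the attached patch, and the class and method names are hypothetical), a percent-encoded target such as the ones above can be decoded with java.net.URI. The sketch assumes the target is a simple relative reference:

```java
import java.net.URI;

// Hypothetical illustration, not the attached patch.
class TargetDecodeSketch {

    /** Returns the percent-decoded form of a relative link target. */
    static String decodeTarget(String encodedTarget) {
        // getPath() returns the decoded path; getRawPath() would keep the %20 escapes.
        return URI.create(encodedTarget).getPath();
    }

    public static void main(String[] args) {
        System.out.println(decodeTarget("Performance%20Resume%202")); // prints Performance Resume 2
    }
}
```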

Thanks for your attention.




Jenkins build is back to normal : POI-DSL-Maven #658

2018-09-29 Thread Apache Jenkins Server
See 





[Bug 60021] Apache POI POM should include runtime dependency on poi-ooxml

2018-09-29 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=60021

Dominik Stadler  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|poi POM should include      |Apache POI POM should
                   |runtime dependency on       |include runtime dependency
                   |poi-ooxml                   |on poi-ooxml




[Bug 62701] Introduce an API to generate .msg files

2018-09-29 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=62701

Dominik Stadler  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |enhancement




[GitHub] poi pull request #126: Fixed text search error.

2018-09-29 Thread zhangfugui727
GitHub user zhangfugui727 opened a pull request:

https://github.com/apache/poi/pull/126

Fixed text search error.

### Fix the incorrect start position in paragraph text search results.
### For example:
Search for `${code}` in the runs `["code:${", "code","}"]`.
The correct result should be
```
### start
startRun 0
startText 0
startChar 5
### end
endRun 2
endText 0
endChar 0
```
But the actual implementation only retains the `startRun` state;
`startText` and `startChar` are reset on each loop iteration.
```
startRun 0
startText 0
startChar 0
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangfugui727/poi trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/poi/pull/126.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #126


commit db519d37c4439d15d9c4c2178490977f3422f15c
Author: zhangfugui <30425581+zhangfugui727@...>
Date:   2018-09-29T09:10:00Z

Fixed text search error.

Such as:
Search ${code} in Runs data (["code:${", "code","}"])
The correct result returned should be
StartRun 0
StartText 0
StartChar 5
EndRun 2
EndText 0
EndChar 0
But the actual interior only retains the startRun state, and startText, 
startChar are reset at each loop.
StartRun 0
StartText 0
StartChar 0



