[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2014-01-15 Thread Mane (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872161#comment-13872161 ]

Mane commented on TIKA-1121:


Any update on this one? I can see that the snapshot version has a fix for files
that were throwing a server error. We are getting plenty of those errors with
version 1.4. Could you check and release a patch? Or is there any update on the
release date of Tika 1.5? Thanks.

Socket server text parsing error on large text files

             Key: TIKA-1121
             URL: https://issues.apache.org/jira/browse/TIKA-1121
         Project: Tika
      Issue Type: Bug
      Components: cli
Affects Versions: 1.4
     Environment: Ubuntu 10.04, 10.10, 12.04.02
        Reporter: Dave Meikle
        Assignee: Dave Meikle

As reported on the user list [1], when using the tika-app socket server
command with the -t switch to parse text, the process hangs on large text
files.
This occurs on Ubuntu 10.04, 10.10 and 12.04.02.
[1] http://mail-archives.apache.org/mod_mbox/tika-user/201305.mbox/%3ccagxbzufxsj4h5jwdeux9hhd2fxttq1vsbm7u-vfsyge9vmr...@mail.gmail.com%3E
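
For anyone trying to reproduce the hang, a client for this kind of socket server could be sketched as follows. This is a minimal sketch, not taken from the report: it assumes the server reads the document until the client half-closes the connection and streams the extracted text back, and the function name, host, port, and timeout are all illustrative. The timeout makes a stalled server show up as an exception instead of a client that waits forever.

```python
import socket
import threading


def parse_via_socket(host, port, payload, timeout=30.0):
    """Send payload to a socket server and return everything it sends back.

    Assumption: the server treats end-of-stream on its input as the end of
    the document, so the client half-closes the socket after sending.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:

        def send_all():
            try:
                sock.sendall(payload)
                sock.shutdown(socket.SHUT_WR)  # signal end of document
            except OSError:
                pass  # the reading side below reports the failure

        # Send in the background so that reading can start immediately;
        # otherwise both ends could block on full buffers with large input.
        sender = threading.Thread(target=send_all, daemon=True)
        sender.start()

        chunks = []
        while True:
            chunk = sock.recv(65536)  # raises a timeout error if the server stalls
            if not chunk:
                break
            chunks.append(chunk)
        sender.join(timeout)
    return b"".join(chunks)
```

A call such as `parse_via_socket("localhost", 12345, data)` would use whatever port the server was started with; the values shown here are placeholders.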



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2013-12-11 Thread Mane (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845569#comment-13845569 ]

Mane commented on TIKA-1121:


I've tested with the Tika JAX-RS server. It now works in the cases where it was
returning a server error (I tried PDF files that still return Server Error on
version 1.4).

However, tika-app in server mode (the socket server) still hangs on a very
large HTML file.






[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2013-12-09 Thread Sergey Beryozkin (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843289#comment-13843289 ]

Sergey Beryozkin commented on TIKA-1121:


That one seems to link to the latest one. I can see from a POM that it includes
a CXF 2.7.8 dependency, but I'm not sure whether the attachment-related code is
there too. Please try it.






[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2013-12-05 Thread Sergey Beryozkin (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840252#comment-13840252 ]

Sergey Beryozkin commented on TIKA-1121:


Can you experiment with the latest snapshots? You may have to build
tika-server manually. Use multipart/form-data payloads, though regular requests
may also behave better now, as I've removed a call that led to the creation of
a temporary FileInputStream.
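
As an illustration of what a multipart/form-data payload looks like on the wire, here is a minimal sketch using only the Python standard library. The URL, form field name, file name, and HTTP method are assumptions for illustration and are not confirmed by this thread; check the tika-server build for the endpoint it actually exposes.

```python
import urllib.request
import uuid


def build_multipart(field, filename, data, boundary=None):
    """Return (content_type, body) for a single-file multipart/form-data payload."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + data + tail


def post_multipart(url, field, filename, data, timeout=60.0):
    """POST the file to url and return the response body.

    Assumption: the server accepts POST with a multipart body at this URL.
    """
    content_type, body = build_multipart(field, filename, data)
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The payload builder is kept separate from the network call so the body can be inspected or logged before it is sent.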






[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2013-11-20 Thread Mane (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827739#comment-13827739 ]

Mane commented on TIKA-1121:


Is there any update on this? I am trying to parse large HTML/PDF files and the
Tika socket server stops responding. I also tried the Tika JAX-RS network app
server, which throws a 500 Server Error. Can someone look into this, please?






[jira] [Commented] (TIKA-1121) Socket server text parsing error on large text files

2013-11-20 Thread Sergey Beryozkin (JIRA)

[ https://issues.apache.org/jira/browse/TIKA-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827856#comment-13827856 ]

Sergey Beryozkin commented on TIKA-1121:


I can look at offering a CXF-specific extension to the Tika JAX-RS server to
support multiparts. It should fix the issue. CXF handles this efficiently: it
stores big attachments in temporary disk storage if needed.
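
The general idea of keeping small attachments in memory and spilling big ones to temporary disk storage can be sketched as below. This is an analogy using Python's `tempfile.SpooledTemporaryFile`, not CXF's implementation; the function name and the 1 MiB threshold are illustrative assumptions.

```python
import tempfile


def buffer_attachment(chunks, max_in_memory=1024 * 1024):
    """Collect chunks into a buffer that moves to a temp file once it
    grows beyond max_in_memory bytes."""
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in chunks:
        spool.write(chunk)  # rolls over to disk once the size limit is exceeded
    spool.seek(0)  # rewind so the caller can read from the start
    return spool
```

The caller reads the returned object like any binary file and closes it when done, which also removes the temporary file if one was created.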



