Hi Kamil, I bet that it is one specific file that was causing the problem. By increasing the stack space, you allowed the file to be processed. Now it won't get processed again until it changes.
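To illustrate why extra stack space can get a single problematic file through, here is a minimal sketch. It is not ManifoldCF or Tika code: the class `RegexStackDemo`, its helper names, the pattern `(a|b)*`, the input length, and both stack sizes are assumptions picked for the demo. It runs the same recursive regex match on two threads, requesting a small and a large stack through the four-argument `Thread` constructor (the JVM treats that size as a hint only).

```java
import java.util.regex.Pattern;

public class RegexStackDemo {

    // Hypothetical helper: builds "ababab..." of the requested length.
    static String buildInput(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(i % 2 == 0 ? 'a' : 'b');
        }
        return sb.toString();
    }

    // Hypothetical helper: runs the match on a thread with the requested
    // stack size and reports "matched", "no match", or "StackOverflowError".
    static String matchWithStack(String regex, String input, long stackBytes) {
        final String[] outcome = new String[1];
        Runnable task = () -> {
            try {
                boolean ok = Pattern.compile(regex).matcher(input).matches();
                outcome[0] = ok ? "matched" : "no match";
            } catch (StackOverflowError e) {
                // The matcher recursed deeper than this thread's stack allows.
                outcome[0] = "StackOverflowError";
            }
        };
        // The last argument is the requested stack size in bytes (a hint).
        Thread worker = new Thread(null, task, "regex-demo", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        String input = buildInput(50_000);
        System.out.println("256 KB stack: "
                + matchWithStack("(a|b)*", input, 256L * 1024));
        System.out.println("256 MB stack: "
                + matchWithStack("(a|b)*", input, 256L * 1024 * 1024));
    }
}
```

On a typical HotSpot JVM the small-stack run is expected to end in a `StackOverflowError`, while the large-stack run should complete. Passing `-Xss` in the options file has a similar effect, but for the default stack size of every new thread rather than one.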
My thought is that this is *probably* related to Tika. Are you using the Tika transformer?

Karl

On Wed, Apr 15, 2015 at 9:11 AM, Kamil Żyta wrote:

> I stopped all agents, removed all logs, added '-Xss500m' to the options file,
> started the agents, and the errors are gone. Then I removed '-Xss500m' from the
> options to trap the source of the problem, restarted all agents, and there are
> still no errors.
>
> *magic*
>
> Thanks, Karl, for your patience with my weird problems.
>
> K
>
> On Wed, Apr 15, 2015 at 08:39:52AM -0400, Karl Wright wrote:
> > Hi Kamil,
> >
> > I believe your logs are probably "rolling". This means that when the log
> > gets full, or another day starts, a new log file starts. I don't know, of
> > course, because I did not configure your system.
> >
> > What I *do* know is that the stack trace that you are providing me is
> > incomplete, and while it is clear that the Java regular expression parser
> > is failing in some way (by doing infinite recursion), I have no idea what
> > *context* this is occurring in, without the end of that stack trace.
> >
> > This may be occurring almost anywhere, which is why I need the trace. Even
> > String.replace() and String.split() use regexps and can be at fault.
> > Without a definitive source, there's little I can do.
> >
> > One thing you can certainly try is to provide a larger amount of stack
> > space to the JVM and just hope the problem goes away.
> > That would mean editing one of the options files and adding a parameter:
> >
> > -Xss500m
> >
> > (for instance)
> >
> > If you would rather get to the source of the problem, I suggest the
> > following:
> >
> > (1) Shut down all agents processes
> > (2) Remove all logs
> > (3) Start the agents process
> > (4) Tail the log looking for "FATAL": tail -f manifoldcf.log | grep FATAL
> > (5) As soon as you see that, shut down the agents process
> > (6) Look at the log file produced
> >
> > References:
> >
> > http://stackoverflow.com/questions/7509905/java-lang-stackoverflowerror-while-using-a-regex-to-parse-big-strings
> >
> > Karl
> >
> > On Wed, Apr 15, 2015 at 8:28 AM, Kamil Żyta wrote:
> >
> > > # java -version
> > > java version "1.8.0_45"
> > > Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
> > >
> > > Is it broken? I don't know. How can I prevent the backtrace from rolling?
> > > It looks like an infinite loop to me.
> > >
> > > K
> > >
> > > On Wed, Apr 15, 2015 at 07:41:37AM -0400, Karl Wright wrote:
> > > > Clearly the logs must have rolled then? Either that or you are using a
> > > > broken jdk.
> > > >
> > > > Karl
> > > >
> > > > On Wed, Apr 15, 2015 at 7:37 AM, Kamil Żyta wrote:
> > > >
> > > > > On Wed, Apr 15, 2015 at 07:27:56AM -0400, Karl Wright wrote:
> > > > > > Hi Kamil:
> > > > > >
> > > > > > kawright@duck76:/data/kawright/analysis$ gzip --version
> > > > > > gzip 1.4
> > > > > > Copyright (C) 2007 Free Software Foundation, Inc.
> > > > > > Copyright (C) 1993 Jean-loup Gailly.
> > > > > > This is free software. You may redistribute copies of it under the terms of
> > > > > > the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
> > > > > > There is NO WARRANTY, to the extent permitted by law.
> > > > > >
> > > > > > Written by Jean-loup Gailly.
> > > > > > kawright@duck76:/data/kawright/analysis$
> > > > > >
> > > > > > But in any case the key part of the stack trace is further down, probably
> > > > > > MUCH further down.
> > > > > >
> > > > > > If I were you, I'd unzip the whole log and use head, tail, and grep to find
> > > > > > where the exception trace ends.
> > > > >
> > > > > I used grep -v and sent you the logs before, but you don't believe me.
> > > > > These are all the mcf logs: http://pastebin.com/T54NKwTh
> > > > > http://pastebin.com/uMxaUnGi
> > > > >
> > > > > K
> > > > >
> > > > > > On Wed, Apr 15, 2015 at 7:18 AM, Kamil Żyta wrote:
> > > > > >
> > > > > > > hmm, try tar -xf manifoldcf.log.gz or maybe zless?
> > > > > > > It works for me with:
> > > > > > > > gzip --version
> > > > > > > gzip 1.6
> > > > > > >
> > > > > > > To be sure, I attached the uncompressed file.
> > > > > > >
> > > > > > > K
> > > > > > >
> > > > > > > On Wed, Apr 15, 2015 at 07:10:07AM -0400, Karl Wright wrote:
> > > > > > > > Hi Kamil,
> > > > > > > >
> > > > > > > > >>>>>>
> > > > > > > > kawright@duck76:~$ cd /data/kawright/analysis/
> > > > > > > > kawright@duck76:/data/kawright/analysis$ gunzip manifoldcf.log.gz
> > > > > > > >
> > > > > > > > gzip: manifoldcf.log.gz: invalid compressed data--crc error
> > > > > > > >
> > > > > > > > gzip: manifoldcf.log.gz: invalid compressed data--length error
> > > > > > > > kawright@duck76:/data/kawright/analysis$
> > > > > > > >
> > > > > > > > <<<<<<
> > > > > > > >
> > > > > > > > Karl
> > > > > > > >
> > > > > > > > On Wed, Apr 15, 2015 at 6:41 AM, Kamil Żyta wrote:
> > > > > > > >
> > > > > > > > > these 1k lines are the same. I attached the full manifoldcf.log.
> > > > > > > > >
> > > > > > > > > K
> > > > > > > > >
> > > > > > > > > On Wed, Apr 15, 2015 at 06:33:06AM -0400, Karl Wright wrote:
> > > > > > > > > > Hi Kamil,
> > > > > > > > > >
> > > > > > > > > > There is a complete trace in there, believe me. The JVM did not say: "(...)
> > > > > > > > > > ~1k lines". What I need is at the bottom of those 1K lines.
> > > > > > > > > >
> > > > > > > > > > Karl
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 15, 2015 at 6:23 AM, Kamil Żyta wrote:
> > > > > > > > > >
> > > > > > > > > > > How can I provide a usable stack trace? I can only copy what the logs say.
> > > > > > > > > > > Now it's a lot of:
> > > > > > > > > > > FATAL 2015-04-15 12:14:35,645 (Worker thread '5') - Error tossed: null
> > > > > > > > > > > java.lang.StackOverflowError
> > > > > > > > > > >         at java.util.regex.Pattern$CharProperty.match(Pattern.java:3776)
> > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4250)
> > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4263)
> > > > > > > > > > > (...) ~1k lines
> > > > > > > > > > >
> > > > > > > > > > > for the continuous job, but the agents process is not exiting. Probably the
> > > > > > > > > > > two errors below aren't correlated (patterns and agents OOM).
> > > > > > > > > > >
> > > > > > > > > > > K
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Apr 14, 2015 at 05:28:18PM -0400, Karl Wright wrote:
> > > > > > > > > > > > Without some kind of usable stack trace I can't really help you. It looks
> > > > > > > > > > > > like some regular expression is going completely haywire, but I have no
> > > > > > > > > > > > idea which one.
> > > > > > > > > > > >
> > > > > > > > > > > > Karl
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 14, 2015 at 4:31 PM, Kamil Żyta wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Apr 14, 2015 at 04:12:55PM -0400, Karl Wright wrote:
> > > > > > > > > > > > > > Hi Kamil,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Without the bottom of the stack trace, I can't even tell what it is doing.
> > > > > > > > > > > > > > Where are you supplying a regular expression?
> > > > > > > > > > > > >
> > > > > > > > > > > > > It's all I have; the only regular expression is in 'Paths':
> > > > > > > > > > > > > 3. Exclude file(s) or directory(s) matching */.*
> > > > > > > > > > > > >
> > > > > > > > > > > > > I found files (~500MB, logs) where the solr logs end;
> > > > > > > > > > > > > excluding them solves the problem. mcf uses tika for extracting
> > > > > > > > > > > > > and only /update to solr. These files caused problems before,
> > > > > > > > > > > > > when using solr to extract docs. Now mcf dies and I do not even know why.
> > > > > > > > > > > > >
> > > > > > > > > > > > > K
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Running out of memory might be a side effect of running out of stack.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Apr 14, 2015 at 2:49 PM, Kamil Żyta wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > the agents process exits with:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > agents process ran out of memory - shutting down
> > > > > > > > > > > > > > > java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > > > >         at java.util.Arrays.copyOfRange(Arrays.java:3664)
> > > > > > > > > > > > > > >         at java.lang.String.<init>(String.java:201)
> > > > > > > > > > > > > > >         at java.lang.StringBuilder.toString(StringBuilder.java:407)
> > > > > > > > > > > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.buildSolrDocument(HttpPoster.java:987)
> > > > > > > > > > > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:882)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > worker threads:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > FATAL 2015-04-14 18:59:11,172 (Worker thread '32') - Error tossed: null
> > > > > > > > > > > > > > > java.lang.StackOverflowError
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$CharProperty.match(Pattern.java:3776)
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4250)
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4263)
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4263)
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4263)
> > > > > > > > > > > > > > > (...) ~1k lines
> > > > > > > > > > > > > > >         at java.util.regex.Pattern$Curly.match0(Pattern.java:4263)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > There are no errors or warnings in the solr logs.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Is it a bug or just a corrupted file?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > K
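Karl's request in the thread above was for the bottom of the trace, meaning the frames below the repeated `java.util.regex` calls that show who started the match. A rough sketch of pulling those frames out of a log follows. It is not part of ManifoldCF: the class `TraceTail` and the method `nonRegexFrames` are hypothetical names, and the code assumes each frame sits on its own line in the usual `at package.Class.method(File.java:N)` form.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class TraceTail {

    // Hypothetical helper: returns the frames of the first StackOverflowError
    // trace that are NOT inside java.util.regex, i.e. the calling context.
    static List<String> nonRegexFrames(Iterable<String> logLines) {
        List<String> frames = new ArrayList<>();
        boolean inTrace = false;
        for (String raw : logLines) {
            String line = raw.trim();
            if (!inTrace) {
                // Skip everything until the exception header line.
                inTrace = line.startsWith("java.lang.StackOverflowError");
                continue;
            }
            if (!line.startsWith("at ")) {
                break; // first non-frame line: the trace has ended
            }
            if (!line.startsWith("at java.util.regex.")) {
                frames.add(line);
            }
        }
        return frames;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the log file name used in the thread.
        String path = args.length > 0 ? args[0] : "manifoldcf.log";
        try (Stream<String> lines = Files.lines(Paths.get(path))) {
            for (String frame : nonRegexFrames(lines::iterator)) {
                System.out.println(frame);
            }
        }
    }
}
```

If the logged trace stops before reaching any non-regex frame, the result is simply empty, which is itself a sign that the log does not contain the calling context.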
