RegExLoader stops an non-matching line
--------------------------------------

                 Key: PIG-593
                 URL: https://issues.apache.org/jira/browse/PIG-593
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.1.0
            Reporter: Vadim Zaliva


Class RegExLoader and all its subclasses stop if some of lines does not match 
provided regular expression.

In particular, I have noticed this when CombinedLogLoader stopped at the 
following line:

58.210.62.24 - - [29/Dec/2008:23:06:57 -0800] "GET /tor/browse/?id=24746&rel=FLY
999%40Jack's+Teen+America+22%2FFLY999原創%40單掛D.C.資訊交流網+Jack's+Teen+Ameri
ca+22+cd1.avi HTTP/1.1" 8952 200 "http://img252.imageshack.us/tor/browse/?id=247
46&rel=FLY999%40Jack%27s+Teen+America+22" "Mozilla/4.0 (compatible; MSIE 6.0; Wi
ndows NT 5.1; )" "-"

Looks like some japanese characters here do not match \S expression used.  

In general I expect it to skip such lines, not to stop processing data file.









-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to