Update AWS S3 log format deserializer regex
-------------------------------------------
Key: HIVE-1483
URL: https://issues.apache.org/jira/browse/HIVE-1483
Project: Hadoop Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.5.0, 0.6.0, 0.7.0
Reporter: Carl Steinbach
>From HIVE-693:
Amazon Changed the format of the logs at the beginning of February 2010, so now
the new regex is:
static Pattern regexpat = Pattern.compile( "(
S+) (
S+) \\[(.?)
] (
S+) (
S+) (
S+) (
S+) (
S+) \"(.)\" (
S) (
S+) (
S+) (
S+) (
S+) (
S+) \"(.)\" \"(.*)\"(?: -)?");
(the only difference is addition of (?: -)? at the end.
Since Amazon hasn't yet documented the last field, I don't know if it is ok to
do a catch-all regex for that field instead of the very specific one I've added.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.