[ https://issues.apache.org/jira/browse/MAHOUT-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144543#comment-13144543 ]
Grant Ingersoll commented on MAHOUT-403:
----------------------------------------
Here's an example of running it against some Solr request logs:
{quote}
--input /path/to/logs --output /tmp/solr/output --regex "(?<=(\?|&)q=).*?(?=&|$)" --overwrite --transformerClass url
{quote}
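
To make clear what that regex pulls out of a log line, here is a minimal standalone sketch in plain Java. It is not the code from the patch: the class name, method name, and sample log line are made up for illustration, and the URL-decoding step is only my assumption about what the "url" transformer does to each match.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of the MAHOUT-403 patch.
class QueryExtractSketch {

    // Same pattern as the --regex argument above: everything after "?q=" or
    // "&q=" up to (but not including) the next "&" or the end of the line.
    static final Pattern QUERY = Pattern.compile("(?<=(\\?|&)q=).*?(?=&|$)");

    // Returns every q= value found in the line, URL-decoded.
    // The decoding is an assumption about what "--transformerClass url" does.
    static List<String> extractQueries(String line) throws UnsupportedEncodingException {
        List<String> queries = new ArrayList<String>();
        Matcher m = QUERY.matcher(line);
        while (m.find()) {
            queries.add(URLDecoder.decode(m.group(), "UTF-8"));
        }
        return queries;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Made-up request line in the style of an access log.
        String line = "GET /solr/select?q=apache+mahout&rows=10 HTTP/1.1";
        for (String q : extractQueries(line)) {
            System.out.println(q);
        }
    }
}
```

Note that the lookbehind requires "?" or "&" directly before "q=", so a parameter such as "fq=" is not matched. If "q" is the last parameter and the line carries trailing text, the match runs to the end of the line, since the pattern only stops at "&" or "$".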
> Regex to Various Output Formats
> -------------------------------
>
> Key: MAHOUT-403
> URL: https://issues.apache.org/jira/browse/MAHOUT-403
> Project: Mahout
> Issue Type: New Feature
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Attachments: MAHOUT-403.patch, MAHOUT-403.patch
>
>
> It would be great to have an M/R job that takes in a line, applies a regex
> to it, and then uses the capturing groups as output to various formats
> (FPG, Classifier, etc.)