Hi Jon,

Thanks for the reply. REGEX_EXTRACT looks pretty useful, but unfortunately I
am not that good at regex.

Could you please give one example of the regex needed here to extract the
date/time part?

Thanks again.

Meenal
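
For reference, here is a minimal sketch of what such a regex could look like
for the sample log line quoted below. It is not from the thread: the alias
and field names (logs, dates, log_time) are made up for illustration, and
TextLoader is swapped in for PigStorage so that each line arrives as one
chararray. The pattern assumes the timestamp always has the shape
dd/Mon/yyyy:HH:mm:ss.

```pig
-- Load each log line whole; TextLoader does not split on a delimiter.
logs = LOAD '/user/sample/log.txt' USING TextLoader() AS (text:chararray);

-- Group 1 captures dd/Mon/yyyy:HH:mm:ss, e.g. 10/Apr/2007:10:40:54.
-- The leading .*? and trailing .* keep the pattern valid whether the
-- function searches inside the line or requires the whole line to match.
dates = FOREACH logs GENERATE
    REGEX_EXTRACT(text, '.*?([0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}).*', 1) AS log_time;

DUMP dates;
```

The third argument to REGEX_EXTRACT selects the capture group, so 1 returns
only the part inside the parentheses. Plain character classes such as [0-9]
are used instead of \d to sidestep backslash escaping inside the Pig string
literal.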

On Mon, Jun 27, 2011 at 5:53 AM, Jonathan Holloway <
[email protected]> wrote:

> Take a look at:
>
> REGEX_EXTRACT -
> http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT
>
> and REGEX_EXTRACT_ALL:
>
> http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT_ALL
>
> You could also use SUBSTRING, but I think a regex would be more applicable
> here for date/time extraction.
>
> Cheers,
> Jon.
>
> On 27 June 2011 08:49, abh not <[email protected]> wrote:
>
> > Hi All,
> >
> > I have a sample log line:
> >
> >   139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET /favicon.ico HTTP/1.1"
> > 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3)
> > Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
> >
> > If I load this file as a string:
> >
> > a = load '/user/sample/log.txt' using PigStorage('\t') as (text:
> > chararray);
> >
> > then how can I extract part of the string? For example, if I want to
> > extract the date '10/Apr/2007:10:40:54', can I achieve this using a
> > Pig script?
> >
> > Any help or suggestions are welcome.
> >
> > Thanks in advance.
> >
> > Meenal
> >
>
