Hi Jon,

Thanks for the reply. REGEX_EXTRACT looks pretty useful, but unfortunately I am not that good at regex. Could you please give one example of what the regex would be here to extract the date/time part?

Thanks again.
Meenal

On Mon, Jun 27, 2011 at 5:53 AM, Jonathan Holloway <[email protected]> wrote:

> Take a look at:
>
> REGEX_EXTRACT -
> http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT
>
> and REGEX_EXTRACT_ALL:
>
> http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT_ALL
>
> You could also use SUBSTRING, but I think a regex would be more applicable
> here for date/time extraction.
>
> Cheers,
> Jon.
>
> On 27 June 2011 08:49, abh not <[email protected]> wrote:
>
> > Hi All,
> >
> > I have few sample log:
> >
> > 139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET /favicon.ico HTTP/1.1"
> > 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3)
> > Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
> >
> > If load this file in as string
> >
> > a = load '/user/sample/log.txt' using PigStorage('/t') as (text:
> > chararray);
> >
> > then how can I extract a part of string from it, for example if I want to
> > extract date '10/Apr/2007:10:40:54' from it, Then can I achieve this thing
> > using Pig script?
> >
> > Any help or suggestions are welcome.
> >
> > Thanks in advance.
> >
> > Meenal
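
For reference, here is a minimal sketch of the kind of pattern being asked about. It is not from the thread and has not been run against a real cluster. It assumes each log entry sits on a single line and that the timestamp always has the shape dd/Mon/yyyy:hh:mm:ss. TextLoader() is swapped in for the original PigStorage call so that each line arrives as one chararray, and the alias names `b` and `datetime` are only illustrative. The pattern avoids backslashes so that no string escaping is needed in the script.

```pig
-- Load each log line as a single chararray field.
-- (Assumption: TextLoader replaces the PigStorage call from the original mail.)
a = LOAD '/user/sample/log.txt' USING TextLoader() AS (text:chararray);

-- Capture group 1 is the date/time, e.g. 10/Apr/2007:10:40:54.
-- The leading '.*?' and trailing '.*' let the pattern cover the whole line,
-- so it should behave the same whether matching is partial or full-string.
b = FOREACH a GENERATE
        REGEX_EXTRACT(text,
            '.*?([0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2}).*',
            1) AS datetime;

DUMP b;
```

If the timezone offset (+0300) is also needed, a second capture group after the first could pick it up, with REGEX_EXTRACT_ALL returning both groups as a tuple.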
