Thank you very much, Ruslan! That works well!

Greetings,
Johannes

On 04.07.2012 15:53, Ruslan Al-Fakikh wrote:
> Hi Johannes,
>
> Try this
> C = LOAD 'in.dat' AS (A1);
> A = LOAD 'in2.dat' AS (A1);
>
> joined = JOIN A BY A1 LEFT OUTER, C BY A1;
>
> DESCRIBE joined;
>
> newEntries = FILTER joined BY C::A1 IS NULL;
>
> DUMP newEntries;
>
> Ruslan
>
> On Wed, Jul 4, 2012 at 4:42 PM, Johannes Schwenk
> <[email protected]> wrote:
>> Hi Alan,
>>
>> I'd like to use this method to exclude records from my output that are
>> already present in previously computed data. So I tried to use your
>> suggestion like this:
>>
>> grunt> cat in.dat
>> 1
>> 2
>> 3
>> 4
>> 5
>> 6
>> 7
>> 8
>> 9
>> grunt> C = LOAD 'in.dat' AS (A1); -- previously generated data
>> grunt> cat in2.dat
>> 12
>> 2
>> 13
>> 1
>> 10
>> 9
>> 11
>> 8
>> grunt> A = LOAD 'in2.dat' AS (A1); -- new data
>> grunt> B1 = join A by A1, C by A1;
>> grunt> B2 = filter B1 by SIZE(C) == 0;
>>
>> Which gives me this error:
>>
>> 2012-07-04 14:36:16,768 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> ERROR 1200: Pig script failed to parse:
>> <line 14, column 23> Invalid scalar projection: C : A column needs to be
>> projected from a relation for it to be used as a scalar
>> Details at logfile: /home/schwenk/pig-0.10.0/pig_1341403702015.log
>>
>> The relevant Pig stack trace from the logfile can be found at
>>
>> http://pastebin.com/MxPfduWS
>>
>> What am I doing wrong?
>>
>> Greetings,
>> Johannes
>>
>> On 25.06.2012 18:39, Alan Gates wrote:
>>> This type of IN is really a semi-join. So you could rewrite this as:
>>>
>>> B1 = join A by A1, C by A1;
>>> B2 = filter B1 by SIZE(C) > 0;
>>> B = foreach B2 flatten(A);
>>>
>>> Alan.
>>>
>>> On Jun 25, 2012, at 2:50 AM, yonghu wrote:
>>>
>>>> Dear all,
>>>>
>>>> In SQL, there is an IN clause which is used to check whether a value
>>>> is in a set. Does Pig also have such an IN clause? For example:
>>>>
>>>> B = filter A by A1 in C;
>>>>
>>>> A, B, C are relation names and A1 is a column name of A.
>>>>
>>>> Thanks!
>>>>
>>>> Yong
>>>
>>
>> Johannes Schwenk
>>
>> --
>> Software Developer (Reporting)
>> ________________________________________________________
>>
>> ADITION technologies AG
>> Schwarzwaldstraße 78b
>> 79117 Freiburg
>>
>> http://www.adition.com
>>
>> T +49 / (0)761 / 88147 - 30
>> F +49 / (0)761 / 88147 - 77
>> SUPPORT +49 / (0)1805 - ADITION
>>
>> (Landline rate 14 ct/min; mobile rates at most 42 ct/min)
>>
>> Registered at the Amtsgericht Düsseldorf (district court) under HRB 54076
>> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>> Chairman of the Supervisory Board: Daniel Raimer, attorney-at-law
>> VAT ID No.: DE 218 858 434
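
For reference, both directions of the question in this thread (a SQL-style IN and NOT IN against another relation) can be sketched with plain joins. This is only a minimal, untested sketch under assumptions the thread does not state: A1 is loaded as chararray, and the aliases C_keys, matched, old_entries, unmatched and new_entries are made up for illustration. It avoids the SIZE(C) test, which fails to parse after a JOIN because C is then a relation name and not a bag field of the joined tuples.

```pig
-- Sketch only: the chararray type and all new alias names are assumptions.
A = LOAD 'in2.dat' AS (A1:chararray);   -- new data
C = LOAD 'in.dat'  AS (A1:chararray);   -- previously generated data

-- Deduplicate C so that a join cannot multiply rows of A.
C_keys = DISTINCT C;

-- IN: keep rows of A whose A1 also occurs in C (semi-join).
matched     = JOIN A BY A1, C_keys BY A1;
old_entries = FOREACH matched GENERATE A::A1 AS A1;

-- NOT IN: keep rows of A whose A1 does not occur in C (anti-join).
joined      = JOIN A BY A1 LEFT OUTER, C_keys BY A1;
unmatched   = FILTER joined BY C_keys::A1 IS NULL;
new_entries = FOREACH unmatched GENERATE A::A1 AS A1;

DUMP old_entries;
DUMP new_entries;
```

With the in.dat and in2.dat contents shown in the thread, and assuming the files hold exactly one clean value per line, old_entries should contain 1, 2, 8 and 9 and new_entries should contain 10, 11, 12 and 13, in no guaranteed order. A test on the size or emptiness of C would fit a COGROUP instead of a JOIN, since there C appears as a bag in each output tuple.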
