The following page has been changed by AmirYoussefi:
http://wiki.apache.org/pig/FAQ

New page:
---+!! PigFAQ

---++++ 1. I'm using PigStorage to parse my input files. Can I make it use control characters as delimiters?

A. Yes. For example, use =PigStorage('\u0001')= for Ctrl+A, or 
=PigStorage('\u007C')= for the pipe character (|).


---++++ 2. Can I do a numerical comparison while filtering?

A. Yes, you can choose between numerical and string comparison. For numerical 
comparison use the operators ==, !=, < etc., and for string comparison use eq, 
neq etc.

---++++ 3. How do I make my jobs run on multiple machines?

A. Use the PARALLEL clause. For example: 
=C = JOIN A by url, B by url PARALLEL 50=

---++++ 4. Does Pig support NULLs?

A. Pig currently has no support for NULL values but it is on the roadmap.

---++++ 5. Does Pig support regular expressions?

A. Pig supports regular expression matching via the =matches= keyword. It uses 
java.util.regex matching, which means your pattern has to match the entire 
string (i.e., if your string is "hi fred" and you want to find "fred", you have 
to give a pattern of ".*fred", not "fred").
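
Since the answer above relies on java.util.regex semantics, a plain-Java sketch may make the whole-string rule concrete. This is not Pig code: the class and method names (=MatchesDemo=, =fullMatch=, =containsMatch=) are hypothetical, and the sketch assumes that =matches= behaves like Java's =String.matches=, as the answer states.

```java
final class MatchesDemo {
    // Whole-string match: the pattern must cover the entire input.
    static boolean fullMatch(String input, String regex) {
        return input.matches(regex);
    }

    // Substring search, shown only for contrast with fullMatch.
    static boolean containsMatch(String input, String regex) {
        return java.util.regex.Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("hi fred", "fred"));     // false
        System.out.println(fullMatch("hi fred", ".*fred"));   // true
        System.out.println(containsMatch("hi fred", "fred")); // true
    }
}
```

To match a substring anywhere in the string under whole-string semantics, the pattern can be wrapped as ".*fred.*".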

---++++ 6. How do I prevent failure if some records don't have the needed number of columns?

You can filter out those records by including the following in your Pig 
program:

<verbatim>
A = load 'foo' using PigStorage('\t');
B = FILTER A BY ARITY(*) >= 5;
.....
</verbatim>

This code drops all records that have fewer than 5 columns.
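
The intended rule, keeping only records with at least 5 columns, can be sketched in plain Java for comparison. This is an illustration rather than Pig's implementation: =ArityFilterDemo= and =hasEnoughColumns= are hypothetical names, and the sketch assumes tab-delimited text where trailing empty fields still count as columns.

```java
final class ArityFilterDemo {
    // Returns true when the tab-delimited line has at least minColumns fields.
    // The -1 limit keeps trailing empty fields (an assumption of this sketch).
    static boolean hasEnoughColumns(String line, int minColumns) {
        return line.split("\t", -1).length >= minColumns;
    }

    public static void main(String[] args) {
        String[] lines = { "a\tb\tc\td\te", "a\tb\tc" };
        for (String line : lines) {
            // A filter keeps the records for which the condition is true.
            if (hasEnoughColumns(line, 5)) {
                System.out.println("kept: " + line.replace('\t', ','));
            }
        }
        // prints "kept: a,b,c,d,e"
    }
}
```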

---++++ 7. Is there any difference between == and eq for numeric comparisons?

For equality, there is no difference as long as both values are integers. 
However, 11.0 and 11 are equal under == but not under eq.
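
The distinction can be illustrated outside Pig with a short Java sketch. It assumes, as described above, that == compares values as numbers while eq compares them as strings. The names =CompareDemo=, =numericEquals= and =stringEquals= are hypothetical, and the code does not show how Pig implements either operator.

```java
final class CompareDemo {
    // Numeric comparison: both operands are parsed as numbers first.
    static boolean numericEquals(String a, String b) {
        return Double.parseDouble(a) == Double.parseDouble(b);
    }

    // String comparison: the text must be identical character by character.
    static boolean stringEquals(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(numericEquals("11", "11"));   // true
        System.out.println(stringEquals("11", "11"));    // true
        System.out.println(numericEquals("11.0", "11")); // true
        System.out.println(stringEquals("11.0", "11"));  // false
    }
}
```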

---++++ 8. Is there an easy way for me to figure out how many rows exist in a dataset from its alias?

You can run the following set of commands:

<verbatim>
a = load 'bla' ... ;
b = group a all;
c = foreach b generate COUNT(a.$0);
</verbatim>

This is equivalent to select count(*) in SQL.

---++++ 9. Does Pig allow grouping on expressions?

Currently, Pig only allows grouping on data fields, not on expressions. 
Support for grouping on expressions is on our roadmap. Stay tuned!
