[Pig Wiki] Update of "PigOverview" by FlipKromer

Apache Wiki Tue, 02 Dec 2008 20:27:53 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.


The following page has been changed by FlipKromer:
http://wiki.apache.org/pig/PigOverview

The comment on the change is:
Pointed to Run and Build pages (DRY)

------------------------------------------------------------------------------
  
   * Additionally, users can incorporate their own custom code into essentially 
any Pig Latin operation. For example, if a user has a function that determines 
whether a given image contains a human face, the user can ask Pig to filter 
images according to this function. Pig will then evaluate this function on the 
user's behalf, over the images. If the evaluation mechanism incorporates 
parallelism, as is the case with the Hadoop evaluation mechanism, then the 
user's function will be executed in a parallel fashion.
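
The idea in the bullet above can be sketched in a few lines of Python. This is only an analogy, not Pig Latin and not the Pig or Hadoop API: `contains_face` is a hypothetical stand-in for the user's function (it just inspects a made-up `tags` field), and `parallel_filter` is an assumed helper that plays the role of the evaluation mechanism, here a thread pool rather than a Hadoop cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def contains_face(image):
    """Hypothetical user function: a real one would inspect pixel data."""
    return "face" in image["tags"]


def parallel_filter(records, predicate, workers=4):
    """Keep the records for which predicate is true.

    The predicate is evaluated on the user's behalf by a pool of worker
    threads; pool.map returns results in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(predicate, records))
    return [record for record, keep in zip(records, verdicts) if keep]


# Made-up sample data, for illustration only.
images = [
    {"name": "a.jpg", "tags": ["face", "outdoor"]},
    {"name": "b.jpg", "tags": ["landscape"]},
    {"name": "c.jpg", "tags": ["face"]},
]

with_faces = parallel_filter(images, contains_face)
print([img["name"] for img in with_faces])
```

The user supplies only the predicate; how and where it runs is left to the evaluation mechanism, which is the division of labour the bullet describes.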
  
- == How to run Pig: ==
+ == Before you start ==
  
+ Make sure you [BuildPig] (or download it) and then [RunPig]. [RunPig] will 
help you check your configuration and run a silly little task. Then come back 
here to learn how to use Pig.
- You have three options:
-  * Interactive shell, called ''Grunt''
-  * Script file
-  * Embed in a host language; currently we support Java as the host language 
(embedding Pig Latin in Java is very similar to JDBC)
  
  == Data formats and models: ==
  
@@ -113, +110 @@

   * Do simple processing (e.g., count the number of images on the web)
   * ... or complex processing (e.g., count the number of images that contain 
faces).
  
- 
+  * It's especially right for you if you have access to a multi-machine 
cluster already running Hadoop.
  
  '''Pig is not right for you''' if you:
   * Need to retrieve individual records, or small ranges of records, from a 
very large data set (e.g., lookup Joe Smith's customer profile).
