Thanks, buddy!

 

Thanks and regards,
Vikram Elango
The Home Depot, 
Nortel no: 0441-3806 

Mobile: +91-8939662345

 

From: John Omernik [mailto:j...@omernik.com] 
Sent: Friday, August 31, 2012 5:44 PM
To: user@hive.apache.org
Subject: Force number of records per map task

 

This is going to sound very odd, but I am hoping to use a transform
script in such a way that I pass a file path to the script, which then
reads the file and produces a bunch of rows in Hive. In this case the
data is pcaps. I have a location accessible to all nodes, and I want my
transform script to read in a file location and then spit out, for
example, the IP addresses that were seen in the packet capture (using a
script I've already written). Can I load my file locations into a table
in Hive (one file per row), read that table into a transform script, and
have only one map task per source row? I don't want my script to parse
several files; it may make for some poor parallelization, but I am
having trouble forcing such a small record count per map task.
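
For illustration, here is a minimal sketch of the kind of transform
script described above, assuming it is written in Python. Hive's
TRANSFORM passes each input row to the script as a tab-separated line on
stdin and reads tab-separated rows back from stdout. The name
extract_ips is hypothetical: it stands in for the pcap-parsing script
that already exists, and the stub below returns nothing. This sketch
does not address how to force one row per map task.

```python
import sys


def extract_ips(pcap_path):
    """Hypothetical placeholder for the existing pcap-parsing script.

    Assumption: the real implementation opens the capture at pcap_path
    (on storage visible to every node) and returns the IP addresses seen.
    This stub returns an empty list so the sketch stays self-contained.
    """
    return []


def transform(lines, extractor=extract_ips):
    """Yield one tab-separated output row per (file path, IP address)."""
    for line in lines:
        # The first tab-separated column is taken to be the file path.
        path = line.rstrip("\n").split("\t")[0].strip()
        if not path:
            continue  # skip blank input rows
        for ip in extractor(path):
            yield f"{path}\t{ip}"


def main():
    for row in transform(sys.stdin):
        print(row)


# Hive pipes rows into the script, so only run when stdin is not a terminal.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Keeping the parsing behind a single function means the script emits rows
for however many paths it receives, whether the map task is handed one
source row or several.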

 

Thoughts? 

 

 


