I have a DataFrame that serves as a reference table for IP frequencies,
e.g.:

ip                 freq
10.226.93.67       1
10.226.93.69       1
161.168.251.101    4
10.236.70.2        1
161.168.251.105    14


All I need is to query the DataFrame (df) from inside a map:

rdd = sc.parallelize(['208.51.22.18', '31.207.6.173', '208.51.22.18'])

freqs = rdd.map(lambda x: df.where(df.ip == x).first())

It doesn't get through. I would appreciate any help.
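
To make the intended result concrete, here is a minimal plain-Python sketch of the lookup I am after. It assumes the reference table is small enough to hold in memory as a dict. The names freq_by_ip and lookup_freq are made up for illustration, and returning None for an IP that is not in the table is an assumption, not a requirement.

```python
# Reference table from the example above, held as a plain dict (ip -> freq).
# Assumption: the table is small enough to fit in memory.
freq_by_ip = {
    "10.226.93.67": 1,
    "10.226.93.69": 1,
    "161.168.251.101": 4,
    "10.236.70.2": 1,
    "161.168.251.105": 14,
}


def lookup_freq(ip):
    """Return the frequency for ip, or None if it is not in the table (assumed behavior)."""
    return freq_by_ip.get(ip)


# Illustrative input: two IPs from the table and one that is not in it.
ips = ["161.168.251.101", "208.51.22.18", "161.168.251.105"]
freqs = [lookup_freq(ip) for ip in ips]
print(freqs)  # [4, None, 14]
```

In Spark, such a dict could presumably be built from df.collect(), shipped to the workers with sc.broadcast, and read through .value inside the map. I have not tested that part.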

Thanks!
Ping




-- 
Ping Yan
Ph.D. in Management
Dept. of Management Information Systems
University of Arizona
Tucson, AZ 85721
