I have a DataFrame that serves as a reference table for IP frequencies, e.g.:

    ip                 freq
    10.226.93.67       1
    10.226.93.69       1
    161.168.251.101    4
    10.236.70.2        1
    161.168.251.105    14
All I need is to query the df inside a map:

    rdd = sc.parallelize(['208.51.22.18', '31.207.6.173', '208.51.22.18'])
    freqs = rdd.map(lambda x: df.where(df.ip == x).first())

This doesn't work. I would appreciate any help.

Thanks!
Ping

--
Ping Yan
Ph.D. in Management
Dept. of Management Information Systems
University of Arizona
Tucson, AZ 85721