Hi all,
I am trying to store the results of a reduce in MongoDB.
To do that, I want to share the pymongo collection object ("collec" in the code below) with the mappers.
Here's what I have so far (I'm using pymongo):
db = MongoClient()['spark_test_db']
collec = db['programs']

def mapper(val):
    asc = val.encode('ascii','ignore')
    json = convertToJSON(asc, indexMap)
    collec.insert(json)  # this is not working

def convertToJSON(string, indexMap):
    values = string.strip().split(",")
    json = {}
    for i in range(len(values)):
        json[indexMap[i]] = values[i]
    return json

How can I make the collection usable inside the mappers?
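
For reference, here is a minimal sketch of one pattern that is often suggested for this kind of problem. It assumes the cause is that a MongoClient created on the driver cannot be shipped to the workers inside the mapper's closure, which I have not confirmed. The idea is to open the connection on the worker, once per partition, instead of sharing it. The names convert_to_json, open_collection, save_partition and the collection_factory parameter are hypothetical, the lines are assumed to already be str, and insert_many assumes pymongo 3 or later.

```python
def convert_to_json(line, index_map):
    """Turn one CSV line into a dict keyed by the column names in index_map."""
    values = line.strip().split(",")
    return {index_map[i]: value for i, value in enumerate(values)}


def open_collection():
    """Open the collection on the worker (assumes a default local MongoDB)."""
    # Imported here so the client is created where the code runs,
    # not on the driver.
    from pymongo import MongoClient
    return MongoClient()["spark_test_db"]["programs"]


def save_partition(lines, index_map, collection_factory=open_collection):
    """Insert all lines of one partition through a single connection."""
    collection = collection_factory()
    docs = [convert_to_json(line, index_map) for line in lines]
    if docs:  # insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)


# Hypothetical usage, assuming `rdd` is an RDD of CSV lines:
# rdd.foreachPartition(lambda lines: save_partition(lines, indexMap))
```

Passing the factory as a parameter is only there so the function can be tried without a running MongoDB. In a real job the default would be used.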