I'm adding my code here. Could you have a look and help me out?
Thanks
#######################

import tokenizer
import gettingWordLists as gl
from pyspark.mllib.classification import NaiveBayes
from pyspark.mllib.regression import LabeledPoint
from numpy import array
from pyspark import SparkContext, SparkConf

conf = (SparkConf()
        .setMaster("local[6]")
        .setAppName("My app")
        .set("spark.executor.memory", "1g"))

sc = SparkContext(conf=conf)
# Load the positive and negative word lists.
pos_list = gl.getPositiveList()
neg_list = gl.getNegativeList()

tok = tokenizer.Tokenizer(preserve_case=False)
train_data = []

with open("training_file_coach.csv","r") as train_file:
    for line in train_file:
        tokens = line.split("######")
        msg = tokens[0]
        sentiment = tokens[1]
        pos_count = 0
        neg_count = 0
#        print sentiment + "\n\n"
#        print msg
        tokens = set(tok.tokenize(msg))
        for i in tokens:
            if i.encode('utf-8') in pos_list:
                pos_count+=1
            if i.encode('utf-8') in neg_list:
                neg_count+=1
        if sentiment.__contains__('NEG'):
            label = 0.0
        else:
            label = 1.0

        feature = []
        feature.append(label)
        feature.append(float(pos_count))
        feature.append(float(neg_count))
        train_data.append(feature)
    train_file.close()

model = NaiveBayes.train(sc.parallelize(train_data))
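# (Optional sanity check with made-up counts: a message with 3 positive
# words and 0 negative words should come out positive, i.e. 1.0.)
# print model.predict(array([3.0, 0.0]))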


file_predicted = open("predicted_file_coach.csv", "w")

with open("prediction_file_coach.csv", "r") as predict_file:
    for line in predict_file:
        msg = line.rstrip("\n")
        pos_count = 0
        neg_count = 0
        for token in set(tok.tokenize(msg)):
            if token.encode('utf-8') in pos_list:
                pos_count += 1
            if token.encode('utf-8') in neg_list:
                neg_count += 1
        prediction = model.predict(array([float(pos_count), float(neg_count)]))
        if prediction == 0:
            sentiment = "NEG"
        elif prediction == 1:
            sentiment = "POS"
        else:
            print "ERROR: unexpected prediction", prediction

        # Feed the newly labelled message back into the training set and
        # retrain. Note: this launches a full Spark job per input line,
        # which is very slow; see the sketch after the code.
        train_data.append(LabeledPoint(float(prediction),
                                       [float(pos_count), float(neg_count)]))
        model = NaiveBayes.train(sc.parallelize(train_data))
        file_predicted.write(msg + "######" + sentiment + "\n")

file_predicted.close()
###################
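Since retraining inside the loop kicks off a Spark job per message, it may
be much faster to predict everything with the initial model and retrain once
at the end. A rough sketch under that assumption (`counts` below is a
hypothetical list of (pos_count, neg_count) pairs gathered from the
prediction file):

new_points = []
for pos_count, neg_count in counts:
    pred = model.predict(array([float(pos_count), float(neg_count)]))
    new_points.append(LabeledPoint(float(pred),
                                   [float(pos_count), float(neg_count)]))

# One retraining pass over the augmented training set.
train_data.extend(new_points)
model = NaiveBayes.train(sc.parallelize(train_data))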

If you could have a look at the code and help me out, that would be great.

Thanks
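On the memory-limits question in the thread below: one way is to raise the
limits when submitting the job. A sketch, assuming Spark 1.0's spark-submit
(the script name and the 4g values are just placeholders; in local mode the
executors run inside the driver JVM, so the driver memory is the limit that
matters):

spark-submit --driver-memory 4g --executor-memory 4g my_sentiment_job.py

The executor setting can also go through SparkConf, e.g.
.set("spark.executor.memory", "4g") in place of the 1g above.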


On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani <rahulbhojwani2...@gmail.com> wrote:

> Hi Marcelo.
> Thanks for the quick reply. Can you suggest how to increase the memory
> limits or how to tackle this problem? I am a novice. If you want, I can
> post my code here.
>
>
> Thanks
>
>
> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> This is generally a side effect of your executor being killed. For
>> example, Yarn will do that if you're going over the requested memory
>> limits.
>>
>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>> <rahulbhojwani2...@gmail.com> wrote:
>> > HI,
>> >
>> > I am getting this error. Can anyone help explain why this error is
>> > coming up?
>> >
>> > ########
>> >
>> > Exception in thread "delete Spark temp dir
>> >
>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>> >  java.io.IOException: Failed to delete:
>> >
>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>> > cmenlp
>> >         at
>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>> >         at
>> >
>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>> >         at
>> >
>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>> >         at
>> >
>> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>> >         at
>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>> >         at
>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>> > PS>
>> > ############
>> >
>> >
>> >
>> >
>> > Thanks in advance
>> > --
>> > Rahul K Bhojwani
>> > 3rd Year B.Tech
>> > Computer Science and Engineering
>> > National Institute of Technology, Karnataka
>>
>>
>>
>> --
>> Marcelo
>>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka
>



-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka
