Hi All,

I am trying to write some unit tests for a program that uses pyspark and scikit-learn, and I've come across some weird behavior. During some tests I receive the following warning: "python/pyspark/serializers.py:327: DeprecationWarning: integer argument expected, got float".
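For reference, my reading of the message: the "!q" format passed to struct.pack denotes a big-endian 8-byte signed integer, so the warning suggests that a float is reaching that call. Below is a minimal sketch of that mechanism outside Spark. Note that pack_long is a hypothetical helper I made up for illustration, not pyspark's code, and the sketch does not explain why the import or the way the test is invoked matters.

```python
import struct


def pack_long(value):
    # Hypothetical helper, not pyspark's code: coerce to int before packing
    # so struct never receives a float for the 8-byte "!q" field.
    # int() truncates toward zero, so 7.9 is packed as 7.
    return struct.pack("!q", int(value))


packed = pack_long(1234.0)                 # float input, as in the warning
unpacked = struct.unpack("!q", packed)[0]  # round-trip back to an integer
```

Whether coercing like this would be appropriate inside pyspark itself is an assumption on my part; it only shows what kind of argument the call expects.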
Although it's only a warning, and my test still passes (i.e. Spark still seems to work), it would be nice to know why it's happening and whether it indicates a real problem, since this can probably happen outside unit testing as well.

Note that the warning occurs when I invoke the test as:

    SPARK_HOME=/home/spark/spark-1.0.0-bin-hadoop1 PYTHONPATH=/home/spark/spark-1.0.0-bin-hadoop1/python python -m unittest -v -b crash_test

Doing any one of the following three things causes the warning to go away:

- invoking as "python crash_test.py" rather than "python -m unittest -v -b crash_test"
- commenting out "import sklearn.metrics"
- changing "lambda x: foo(x)" to "lambda x: x"

Note that I am running the following software:

- Spark 1.0.0
- Python 2.7.3
- scikit-learn 0.14.1
- Ubuntu 12.04

*Exact Warning (actually occurs 3 times):*

    /home/spark/spark-1.0.0-bin-hadoop1/python/pyspark/serializers.py:327: DeprecationWarning: integer argument expected, got float
      stream.write(struct.pack("!q", value))
    /home/spark/spark-1.0.0-bin-hadoop1/python/pyspark/serializers.py:327: DeprecationWarning: integer argument expected, got float
      stream.write(struct.pack("!q", value))
    /home/spark/spark-1.0.0-bin-hadoop1/python/pyspark/serializers.py:327: DeprecationWarning: integer argument expected, got float
      stream.write(struct.pack("!q", value))

*crash_test.py:*

    import unittest

    from pyspark import SparkContext
    import sklearn.metrics


    def foo(x):
        return x


    def setUpModule():
        global sc
        sc = SparkContext('local')
        print sc.parallelize(range(4)).map(lambda x: foo(x)).collect()


    class CrashTest(unittest.TestCase):
        def test(self):
            pass


    if __name__ == '__main__':
        unittest.main()

I'd be glad to know if anybody else has experienced a similar problem, or has insight into what may be happening and whether it is significant.

best,
-Brad