Daniel Papp created HIVE-17487:
----------------------------------

             Summary: Example fails on the Hive Getting started page
                 Key: HIVE-17487
                 URL: https://issues.apache.org/jira/browse/HIVE-17487
             Project: Hive
          Issue Type: Bug
            Reporter: Daniel Papp
            Priority: Trivial

There is an example on the [Hive Getting Started|https://cwiki.apache.org/confluence/display/Hive/GettingStarted] page that uses the MovieLens 100k dataset. The mapper is defined as a Python script in the following way:
{code}
import sys
import datetime

for line in sys.stdin:
    line = line.strip()
    userid, movieid, rating, unixtime = line.split('\t')
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    print '\t'.join([userid, movieid, rating, str(weekday)])
{code}
This is correct only under the Python 2 series: it uses print as a statement, which is a syntax error in Python 3. The following code works with both the 2 and 3 series:
{code}
from __future__ import print_function
import sys
import datetime

for line in sys.stdin:
    line = line.strip()
    userid, movieid, rating, unixtime = line.split('\t')
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    print('\t'.join([userid, movieid, rating, str(weekday)]))
{code}
I think the example on the wiki page should be corrected.
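
For anyone who wants to check the proposed mapper without running Hive, here is a minimal sketch that factors the per-line transform into a function and applies it to one tab-separated record. The function name map_line and the sample values are illustrative assumptions, not part of the wiki example. Because print(...) is called with a single argument, it behaves the same under both series, so the __future__ import is left out of this sketch.

```python
import datetime


def map_line(line):
    # Input:  userid <TAB> movieid <TAB> rating <TAB> unixtime
    # Output: userid <TAB> movieid <TAB> rating <TAB> ISO weekday (1 = Monday ... 7 = Sunday)
    userid, movieid, rating, unixtime = line.strip().split('\t')
    # fromtimestamp() interprets the value in the machine's local time zone,
    # exactly as the script in the description does, so the weekday can differ
    # between hosts for timestamps close to midnight.
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    return '\t'.join([userid, movieid, rating, str(weekday)])


# Illustrative record in the four-column layout the mapper expects.
sample = '196\t242\t3\t881250949\n'
result = map_line(sample)
print(result)
```

The first three fields pass through unchanged and the fourth is replaced by a weekday number from 1 to 7.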