Hi all: I’m trying to use hadoop to process logs. I’ve write some routine to count the login times of the same ip. However, because my logs’ characters are hybrid encoded (ASCII, Unicode, UTF-8 etc), TextInputFormat class in hadoop will error. Do you have some good way to solve this problem?
Thank you very much!
