Hi all, I am trying to create and train a model for a Kaggle competition dataset using Apache Spark. The dataset has more than 10 million rows. However, when training the model, I get the exception "*Size exceeds Integer.MAX_VALUE*".
I found that the same question has been raised on Stack Overflow, but the answers there didn't help much. It would be great if you could help me resolve this issue. Thanks. Minudika