The dataset was using a sparse representation before feeding into
LogisticRegression.

On Tue, Apr 23, 2019 at 3:15 PM Weichen Xu <weichen...@databricks.com>
wrote:

> Hi Qian,
>
> Do your dataset use sparse vector format ?
>
>
>
> On Mon, Apr 22, 2019 at 5:03 PM Qian He <hq.ja...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'm using Spark provided LogisticRegression to fit a dataset. Each row of
>> the data has 1.7 million columns, but it is sparse with only hundreds of
>> 1s. The Spark Ui reported high GC time when the model is being trained. And
>> my spark application got stuck without any response. I have allocated 100
>> executors and 8g for each executor.
>>
>> Is there any thing i should do to make the training process go
>> successfully?
>>
>

Reply via email to