Use .limit on the DataFrame, followed by .write, to persist only the rows you would have shown.
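For example, a minimal PySpark sketch (the table name placeholder and the
output path hdfs:///tmp/df_sample are illustrative; CSV assumes atomic
column types):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Write sample").getOrCreate()

# Take the first 100 rows and write them to HDFS as CSV;
# coalesce(1) collapses the sample into a single output file.
df = spark.table("<yourdf>")
df.limit(100).coalesce(1).write.mode("overwrite").csv("hdfs:///tmp/df_sample", header=True)

Note this saves the data itself, not the exact text that show() prints; the
spark-submit pipe quoted below captures the printed form.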

On Apr 14, 2019, at 5:10 AM, Chetan Khatri
<chetan.opensou...@gmail.com> wrote:
>Nuthan,
>
>Thank you for the reply. The proposed solution captures everything, but in
>my case it is a single DataFrame .show(100) inside 3000 lines of Scala
>Spark code. However, yarn logs --applicationId <APPLICATION_ID> > 1.log
>also dumps all of stdout and stderr.
>
>Thanks
>
>On Sun, Apr 14, 2019 at 10:30 AM Nuthan Reddy
><nut...@sigmoidanalytics.com>
>wrote:
>
>> Hi Chetan,
>>
>> You can use
>>
>> spark-submit showDF.py | hadoop fs -put - showDF.txt
>>
>> showDF.py:
>>
>> from pyspark.sql import SparkSession
>>
>>
>> spark = SparkSession.builder.appName("Write stdout").getOrCreate()
>>
>> spark.sparkContext.setLogLevel("OFF")
>>
>>
>> spark.table("<yourdf>").show(100, truncate=False)
>>
>> But is there any specific reason you want to write it to HDFS? Is this
>> for human consumption?
>>
>> Regards,
>> Nuthan
>>
>> On Sat, Apr 13, 2019 at 6:41 PM Chetan Khatri
><chetan.opensou...@gmail.com>
>> wrote:
>>
>>> Hello Users,
>>>
>>> In Spark, when I call .show(100) on a DataFrame, I want to save the
>>> printed output, exactly as it appears, to a txt file in HDFS.
>>>
>>> How can I do this?
>>>
>>> Thanks
>>>
>>
>>
>> --
>> Nuthan Reddy
>> Sigmoid Analytics
>>
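If the goal is the exact text that show(100) prints, one more option is
PySpark's internal showString helper. This is a sketch, not a public API:
_jdf.showString(numRows, truncate, vertical) is an internal JVM call whose
signature has varied across Spark versions, and hdfs:///tmp/show_output is
an illustrative path.

# Render the same text that df.show(100) would print, then save it to
# HDFS as a single text file.
text = df._jdf.showString(100, 20, False)
spark.sparkContext.parallelize([text], 1).saveAsTextFile("hdfs:///tmp/show_output")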
