Re: Spark-SQL - Query Hanging, How To Troubleshoot

Patrick Tucci Thu, 10 Aug 2023 12:03:48 -0700

Hi Mich,

Thanks for the reply. Unfortunately I don't have Hive set up on my cluster.
I can explore this if there are no other ways to troubleshoot.


I'm using beeline to run commands against the Thrift server. Here's the
command I use:

~/spark/bin/beeline -u jdbc:hive2://10.0.50.1:10000 -n hadoop -f command.sql
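
One way to see what the planner intends without executing the query is to send the same statement prefixed with EXPLAIN FORMATTED. The following is only a minimal sketch: it assumes command.sql contains a single statement, and the helper name make_explain_file and the output file explain.sql are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or beeline): copy a SQL file and
# prefix it with EXPLAIN FORMATTED so the server returns the physical plan
# instead of running the query. Assumes the file holds one statement;
# with several statements only the first would be explained.
make_explain_file() {
    src="$1"   # original SQL file, e.g. command.sql
    dst="$2"   # file to write, e.g. explain.sql
    { printf 'EXPLAIN FORMATTED\n'; cat "$src"; } > "$dst"
}

# Usage (needs a live Thrift server, so it is left commented out):
#   make_explain_file command.sql explain.sql
#   ~/spark/bin/beeline -u jdbc:hive2://10.0.50.1:10000 -n hadoop -f explain.sql
```

The returned plan should show which side of the join is broadcast and which operators follow the broadcast exchange, which may help narrow down where the query stalls.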

Thanks again for your help.

Patrick


On Thu, Aug 10, 2023 at 2:24 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Can you run this sql query through hive itself?
>
> Are you using this command or similar for your thrift server?
>
> beeline -u jdbc:hive2://<hostname>:10000/default
> -d org.apache.hive.jdbc.HiveDriver -n hadoop -p xxx
>
> HTH
>
> Mich Talebzadeh,
> Solutions Architect/Engineering Lead
> London
> United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Thu, 10 Aug 2023 at 18:39, Patrick Tucci <patrick.tu...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I'm attempting to run a query on Spark 3.4.0 through the Spark
>> ThriftServer. The cluster has 64 cores, 250GB RAM, and operates in
>> standalone mode using HDFS for storage.
>>
>> The query is as follows:
>>
>> SELECT ME.*, MB.BenefitID
>> FROM MemberEnrollment ME
>> JOIN MemberBenefits MB
>> ON ME.ID = MB.EnrollmentID
>> WHERE MB.BenefitID = 5
>> LIMIT 10
>>
>> The tables are defined as follows:
>>
>> -- Contains about 3M rows
>> CREATE TABLE MemberEnrollment
>> (
>>     ID INT
>>     , MemberID VARCHAR(50)
>>     , StartDate DATE
>>     , EndDate DATE
>>     -- Other columns, but these are the most important
>> ) STORED AS ORC;
>>
>> -- Contains about 25M rows
>> CREATE TABLE MemberBenefits
>> (
>>     EnrollmentID INT
>>     , BenefitID INT
>> ) STORED AS ORC;
>>
>> When I execute the query, it runs a single broadcast exchange stage,
>> which completes after a few seconds. Then everything just hangs. The
>> JDBC/ODBC tab in the UI shows the query state as COMPILED, but no stages or
>> tasks are executing or pending:
>>
>> [image: image.png]
>>
>> I've let the query run for as long as 30 minutes with no additional
>> stages, progress, or errors. I'm not sure where to start troubleshooting.
>>
>> Thanks for your help,
>>
>> Patrick
>>
>
