Hi Khalid,

Just out of curiosity, does the API help us in setting job IDs, or just job
descriptions?

Regards,
Gourav Sengupta

On Wed, Dec 28, 2022 at 10:58 AM Khalid Mammadov <khalidmammad...@gmail.com>
wrote:

> There is a feature in SparkContext, setLocalProperty, that lets you set
> your request ID as a local property on the calling thread; a SparkListener
> instance can then read that ID alongside the job ID in the onJobStart
> event.
>
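> A minimal Scala sketch of that approach (the "request.id" property name
> and the RequestIdListener class are illustrative choices, not part of the
> Spark API):
>
> ```scala
> import org.apache.spark.SparkContext
> import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
>
> // Reads the request ID back out of each job's local properties.
> class RequestIdListener extends SparkListener {
>   override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
>     Option(jobStart.properties.getProperty("request.id")).foreach { id =>
>       println(s"Request $id -> Spark job ${jobStart.jobId}")
>     }
>   }
> }
>
> // On the driver: register the listener once, then tag the calling
> // thread before triggering jobs for a given request.
> // sc is your existing SparkContext.
> sc.addSparkListener(new RequestIdListener)
> sc.setLocalProperty("request.id", "req-12345")
> // Any action run on this thread now carries request.id in its job
> // properties, so onJobStart can map it to the generated job ID.
> ```
>
> Note that local properties are per-thread, so in a server handling
> parallel requests you would set the property on the thread that submits
> each request's jobs.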
> Hope this helps.
>
> On Tue, 27 Dec 2022, 13:04 Dhruv Toshniwal,
> <dhruv.toshni...@mindtickle.com.invalid> wrote:
>
>> TL;DR:
>> how-to-map-external-request-ids-to-spark-job-ids-for-spark-instrumentation
>> <https://stackoverflow.com/questions/74794579/how-to-map-external-request-ids-to-spark-job-ids-for-spark-instrumentation>
>>
>> Hi team,
>>
>> We are the engineering team at Mindtickle Inc., and we have a use case
>> where we want to store a map of request IDs (unique API call IDs) to
>> Spark job IDs. Architecturally, our users work with various analytics
>> tools on the frontend, which in turn run Spark jobs internally and serve
>> the computed data back to them. We receive various API calls from
>> upstream and serve them via Apache Spark computations on the backend.
>>
>> However, as our customer base has grown, we now receive many parallel
>> requests, and we have observed that Spark jobs take different amounts of
>> time for the same API requests. Therefore, for Spark instrumentation
>> purposes, we wish to maintain a map from the request IDs generated at
>> our end to the job IDs that Spark internally generates for them. This
>> will enable us to go back in time via the history server or custom
>> SparkListeners to debug and improve our system. Any leads in this
>> direction would be greatly appreciated, and I would be happy to explain
>> our use case in greater detail if required.
>>
>> Thanks and Regards,
>> Dhruv Toshniwal
>> SDE-2
>> Mindtickle Inc.
>>
>
