Kludex commented on issue #33176:
URL: https://github.com/apache/beam/issues/33176#issuecomment-2538955701
Okay... Hi there again... 👋
Let me try to explain where I stand now. I'm going to stop here on my own
and ask for advice.
I've been trying to make it work in Dataflow. This is the pipeline I have on
hand (for testing purposes):
```py
import traceback

import apache_beam as beam
import logfire
from apache_beam import Pipeline
from apache_beam.options.pipeline_options import PipelineOptions


class Split(beam.DoFn):
    def process(self, element: str):
        logfire.info(f"in split {element}")
        return element.split(" ")


def logfire_print(element: str):
    # Log the element plus some debugging context from the worker.
    with logfire.span("Print"):
        logfire.info(element)
        logfire.info("{globals}", globals=globals())
        logfire.info("{locals}", locals=locals())
        logfire.info("{traceback}", traceback=traceback.format_stack())


# Options such as --runner and --project are read from the command line.
pipeline_options = PipelineOptions()

with Pipeline(options=pipeline_options) as pipeline:
    text = [
        "To be, or not to be: that is the question: ",
        "Whether 'tis nobler in the mind to suffer ",
        "The slings and arrows of outrageous fortune, ",
        "Or to take arms against a sea of troubles, ",
    ]
    pipeline = (
        pipeline
        | "Create" >> beam.Create(text)
        | "Split" >> beam.ParDo(Split())
        | "Filter" >> beam.Filter(lambda x: x != "the")
        | "Print" >> beam.Map(logfire_print)
    )
```
What we want is to have a span that starts when we run the pipeline, i.e. on
the machine that ran it:
```bash
uv run python main.py \
--region ... \
--runner DataflowRunner \
--project ... \
--temp_location ... \
--requirements_file ./requirements.txt \
--save_main_session
```
We also want a span to be created when each step runs. For example, when
`logfire_print` is called for an element, we want that call to be wrapped in
a span automatically, so that any `logfire.info()` inside it is nested under
that span.
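To make the per-step wrapping concrete, here is a minimal sketch of the idea, not what logfire or Beam actually do. `span` and `events` are hypothetical stand-ins so the snippet runs without logfire installed, and `with_span` is an illustrative name. In the real pipeline, `logfire.span` would take the place of the stand-in.

```python
import functools
from contextlib import contextmanager

# Hypothetical stand-in for `logfire.span`: it only records start/end events.
events = []


@contextmanager
def span(name):
    events.append(f"start {name}")
    try:
        yield
    finally:
        events.append(f"end {name}")


def with_span(fn):
    """Wrap `fn` so that every call runs inside a span named after it."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with span(fn.__name__):
            return fn(*args, **kwargs)

    return wrapper


@with_span
def logfire_print(element):
    # In the real pipeline this would be `logfire.info(element)`.
    events.append(f"info {element}")


logfire_print("be,")
```

After the call, `events` holds the span start, the log line, and the span end in that order, which is the nesting we are after for each element.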
The problem is that I can't monkeypatch anything on the worker side, because
the function is pickled and then sent to the worker.
<img width="883" alt="Screenshot 2024-12-12 at 14 28 07"
src="https://github.com/user-attachments/assets/11a9758d-930c-4ba8-b9e3-aba9e64b16d7"
/>
This is what I got ☝️
I wanted `Print` to be inside the first big span...