[jira] [Commented] (IMPALA-12745) dump_breakpad_symbols.py's parallelism doesn't work with RPM/DEBs

2024-01-29 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812089#comment-17812089
 ] 

Joe McDonnell commented on IMPALA-12745:


Just realized that I got the Jira wrong: This was introduced by IMPALA-10048, 
not IMPALA-11511.

> dump_breakpad_symbols.py's parallelism doesn't work with RPM/DEBs
> -
>
> Key: IMPALA-12745
> URL: https://issues.apache.org/jira/browse/IMPALA-12745
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 4.4.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Critical
> Fix For: Impala 4.4.0
>
>
> When using the "-r" or "--pkg" option, dump_breakpad_symbols.py is extracting 
> the RPM/DEB into a temporary directory. The lifetime of that temporary 
> directory is maintained by yielding tuples from enumerate_pkg_files(). When 
> using parallelism (added in IMPALA-11511), the yield doesn't keep the 
> temporary directory around while the parallel threads are processing, so they 
> fail with:
>  
> {noformat}
> Found debugging info in /tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug
> Failed to open ELF file '/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug': No such file or directory
> Failed to write symbol file.
> {noformat}
>  
> Testing shows that this is still a problem with num_processes=1, so there 
> should also be a change to be able to turn off the ThreadPool entirely. 
> Processing OS packages can force the parallelism off for now as they don't 
> benefit much from parallelism.
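
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual script: enumerate_extracted_files() is a hypothetical stand-in for enumerate_pkg_files(), and the file names are made up. It assumes the pool is driven with ThreadPool.map(), which in CPython materializes a generator into a list before dispatching any task, so the generator's cleanup runs before any worker touches the files, even with a single thread.

```python
import os
import shutil
import tempfile
from multiprocessing.pool import ThreadPool


def enumerate_extracted_files():
    """Hypothetical stand-in for enumerate_pkg_files(): yields paths inside a
    temporary directory that is removed once the generator is exhausted."""
    tmp_dir = tempfile.mkdtemp()
    try:
        for name in ("a.debug", "b.debug"):  # made-up file names
            path = os.path.join(tmp_dir, name)
            with open(path, "w") as f:
                f.write("symbols")
            yield path
    finally:
        # Runs when the generator finishes, deleting everything it yielded.
        shutil.rmtree(tmp_dir)


def process(path):
    """Stand-in for symbol dumping: only reports whether the file still exists."""
    return os.path.exists(path)


def run_serial():
    # Each path is processed while the generator is suspended at its yield,
    # so the temporary directory is still present.
    return [process(p) for p in enumerate_extracted_files()]


def run_pooled(num_threads=1):
    # ThreadPool.map() exhausts the generator first, which triggers the
    # cleanup, and only then hands the (now stale) paths to the workers.
    pool = ThreadPool(num_threads)
    try:
        return pool.map(process, enumerate_extracted_files())
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print("serial:", run_serial())
    print("pooled:", run_pooled(1))
```

In this sketch the serial loop sees every file, while the pooled run sees none of them regardless of the thread count, which matches the observation that num_processes=1 does not help.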



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12745) dump_breakpad_symbols.py's parallelism doesn't work with RPM/DEBs

2024-01-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812085#comment-17812085
 ] 

ASF subversion and git services commented on IMPALA-12745:
--

Commit 41a3f4d4ca43092d0ef48eeaa765626b720e986c in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=41a3f4d4c ]

IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages

When using bin/dump_breakpad_symbols.py to dump symbols for RPM/DEB
packages, the script extracts the packages to a temporary directory
and relies on keeping that directory around until the processing
is finished. The parallel processing added in IMPALA-11511 breaks
the logic that keeps the temporary directory around, so the script
generates errors like:

Found debugging info in /tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug
Failed to open ELF file '/tmp/tmpqfZ9MZ/usr/lib/debug/usr/lib/impala/sbin-retail/impalad.debug': No such file or directory
Failed to write symbol file.

This turns off parallelism for bin/dump_breakpad_symbols.py when
processing RPM/DEB packages (i.e. -r/--pkg). This also avoids using
a ThreadPool when num_processes <= 1.

Testing:
 - Hand tested with Redhat 7 RPMs

Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3
Reviewed-on: http://gerrit.cloudera.org:8080/20943
Reviewed-by: Impala Public Jenkins 
Tested-by: Joe McDonnell 
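
The approach described in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: run_tasks() and its is_pkg parameter are hypothetical names, and the real script's structure may differ. The point is that a plain loop pulls one item at a time from a lazy iterator, so whatever resource the iterator is holding stays alive while each item is processed.

```python
from multiprocessing.pool import ThreadPool


def run_tasks(func, items, num_processes=1, is_pkg=False):
    """Apply func to every item, using a ThreadPool only when it is useful.

    is_pkg is a hypothetical flag standing in for the -r/--pkg case.
    """
    if is_pkg:
        # Package processing depends on lazy iteration, so force serial mode.
        num_processes = 1
    if num_processes <= 1:
        # No pool at all: each item is handled before the iterator advances.
        return [func(item) for item in items]
    pool = ThreadPool(num_processes)
    try:
        return pool.map(func, items)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run_tasks(lambda x: x * 2, iter([1, 2, 3]), num_processes=4, is_pkg=True))
```

Bypassing the pool entirely, rather than creating a pool of size one, matters here because even a single-threaded pool consumes its input ahead of the workers.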




