That's unfortunate.  I tried a simple example to reproduce this but
was unable to do so.  That being said, I am by no means a Windows
expert.  I created a Windows share (I think this uses SMB under the
hood?) on one server.  I then mapped a drive to this share from my
test system (a separate Windows server).  I then ran the following
script with pyarrow 6.0.0 and everything worked correctly:

import pyarrow.parquet as pq
import pyarrow as pa

# Write a small table to the mapped share, then read it back.
tab = pa.Table.from_pydict({'x': [1, 2, 3]})
pq.write_table(tab, 'Z:/foo.parquet')
pq.read_table('Z:/foo.parquet')

Can you try this simple test and see if it works?
Does it hang on other versions of pyarrow, or only on 6.0.0?
Do you have any other details that may help reproduce the issue?
Do you know what kind of technology the shared drive is using (e.g. is
it SMB or NFS or...)?

If you're willing, these instructions[1] show how to generate a dump
file (roughly the Windows equivalent of a core dump) for a running
process.  Can you try the procedure under "To create a dump file for a
hanging process" and send the resulting dump file?  (Keep in mind this
file may contain information about your hardware, server name, etc.)

[1] 
https://docs.microsoft.com/en-US/troubleshoot/windows-server/performance/use-userdump-create-dump-file#to-create-a-dump-file-for-a-hanging-process

On Thu, Nov 11, 2021 at 10:48 PM Farhad Taebi
<[email protected]> wrote:
>
> Hi,
>
> I'm trying to read parquet files which are located on a Windows network drive 
> using pyarrow 6.0.0.
>
> The drive is mounted on my system so I can access the files under 
> H:\my\parquet\files. The problem is that when I execute
>
> pq.read_table("H:\\my\\parquet\\files")
>
> the process just hangs forever. Python is not doing anything (no CPU or RAM 
> usage). I have to kill the Python process manually. This doesn't happen 
> with files located on my local drive. What am I doing wrong? What can I do to 
> solve this problem?
>
> Thanks a lot
>
> FT
