Public bug reported:

Summary:

I believe there’s a bug in GVFS-SMB (or its implementation in Nautilus)
that manifests when writing to SMB2 shares.  It occurs inconsistently
(but enough to be annoying) in everyday use, but it can be replicated
consistently with a simple echo command redirected to a file on the
share.  This doesn’t happen when using the ‘mount’ command to connect to
SMB2 shares.  It also doesn’t happen when making an SMB1 connection
through Nautilus OR mount.  I’ve also tested this against 2 different
SMB servers (a Synology DS216J and a share on a Win7 Pro system) and the
results are consistent so I don’t believe it’s a server side issue.

I don’t know enough about this part of the system to identify where the
issue is or fix it, but I’ve included data below to show the basis for
my assertions above.  The workaround is either going back to SMB1 or
mounting my shares with the mount command, but I was hoping someone
could take this on and fix it long-term.  As Microsoft progressively
phases out SMB1, I think this will eventually become a wider scale
problem so now would be an opportune time to address it.  If there’s any
further data I can provide, please let me know.

Also, I’m sorry this bug report is so long, but I’m erring on the side
of too much detail rather than too little.

Detail:

On my Ubuntu 18.04 system, when I attempt to echo a string into a file
on a GVFS mounted share (left hand nav of Nautilus + Other Locations)
and that share is configured to SMB1 (client side left to negotiate,
server side locked to SMB1) this works as expected.  I tested against
SMB shares on both a Synology Diskstation DS216J NAS and a Windows 7
Professional workstation. When I was on Ubuntu 16.04 that was SMB1 only
out of the box I didn’t know this problem even existed.  Now that I’m on
Ubuntu 18.04 and my share was willing to make connection at SMB2 I
started to have weird, inconsistent disk problems.  Much troubleshooting
and analysis later, I can recreate this issue consistently with a simple
echo command to a file on the share.

To the SMB share on a Synology Diskstation NAS set to SMB1:

echo "This is a test." > /run/user/1000/gvfs/smb-
share\:server\=diskstation\,share\=fileshare/Test-File-SMB1-GVFS-NAS.txt

The command executes with no feedback (other than the return of the
prompt awaiting the next command) and the file of the name specified
with the contents specified appears at the root of the NAS file share.

To the SMB share on a Windows 7 Professional workstation set to SMB1:

echo "This is a test." > /run/user/1000/gvfs/smb-
share\:server\=hp_laptop\,share\=videos/Test-File-SMB1-GVFS-Win7.txt

The command also executes with no feedback (other than the return of the
prompt awaiting the next command) and the file of the name specified
with the contents specified appears at the root of the “videos” file
share on the Win7 workstation.

I also took PCAPs of this occurring (attached).  What they show is about
what I would expect for this operation.  Client SMB pings the server in
packet 1 who responds in packet 2, queries info on the folder in packet
4 and gets some basic info back in packet 5, queries the filename (to
make sure it doesn’t already exist) several times (not sure why), each
time getting a name not found reply in packets 6 through 17, then
creates the file in packet 18, gets back a success message in packet 19,
queries info on the new file in packet 20, gets a success response in
packet 21, writes the contents in packet 22, gets a good response in
packet 23, queries the file one more time (presumably to make sure the
new size is good) in packet 24, gets a response with the info in packet
25, requests the handle be closed in packet 26, and gets an OK to that
in packet 27.

This is all wonderful.  Perhaps slightly inefficient, but I don’t know
what’s going on under the hood, so who am I to judge?  When I switch to
SMB2 on the server side and reconnect via GVFS using Nautilus, things
fall apart.

To the SMB share on a Synology Diskstation NAS set to SMB2:

echo "This is a test." > /run/user/1000/gvfs/smb-
share\:server\=diskstation\,share\=fileshare/Test-File-SMB2-GVFS-NAS.txt

returns

bash: /run/user/1000/gvfs/smb-share:server=diskstation,share=fileshare
/Test-File-SMB2-GVFS-NAS.txt: Invalid argument

The file is where I expected it, but it is 0 bytes.

To the SMB share on a Windows 7 Professional workstation set to SMB2:

echo "This is a test." > /run/user/1000/gvfs/smb-
share\:server\=hp_laptop\,share\=videos/Test-File-SMB2-GVFS-Win7.txt

returns

bash: /run/user/1000/gvfs/smb-share:server=hp_laptop,share=videos/Test-
File-SMB2-GVFS-Win7.txt: Invalid argument

Again, the file is there in the right location, but it is 0 bytes.

Going to take a look at the PCAPs, it’s pretty clear what’s happened,
but less clear why.  The PCAPs were consistent between the NAS and the
Win7 Pro workstation.  So the client requests a file with no name
(presumably as shortcut for pwd) be opened in packet 1 and a handle was
returned in packet 2.  Info about this handle was requested in packet 3
and returned in packet 4 (filename is \) followed by a request to close
that handle in packet 5 and a success response in packet 6.  I assume up
until now, it was just overhead to make sure the share worked properly.
Then 4 repeated requests of disposition 1 (open if exists, else fail)
for the filename I echoed to with the related (and expected) name not
found errors returned in packets 7 through 14.  Then finally a
disposition 3 request (create; fail if file already exists) for the
filename specified in packet 15 followed by a success message in packet
16.  This is the point where things start to go wrong.  We then have 4
repeated file_all_info requests each followed by an access denied in
packets 17 through 24.  Of course, one looks at this as says WTF?  You
JUST created the file (and the 0 byte file does, indeed, appear in the
location requested) so why would a request for file info return an
access denied?  I checked the file handles and they appear consistent.
Unfortunately, I don’t know enough about the inner workings of SMB to
see if there was anything incorrect about the file create in packet 15
or the response in packet 16, but that seems like a good place to start
looking.  After the access denied packets, the client must get the hint
and requests a close of the file handle in packet 25 and gets an OK from
the server to that in packet 26.

So, I stumbled around with this problem for a while comparing SMB1 with
SMB2 (e.g. apples and oranges or, I guess more accurately, oranges and
grapefruits) until someone in the forums suggested using the mount
command instead of GVFS / Nautilus.  I did that and things worked just
fine.

To the SMB share on a Synology Diskstation NAS set to SMB2:

sudo mount -t cifs //diskstation/fileshare /media/NAS -o
username=sawozny,vers=2.0,uid=1000

echo "This is a test." > /media/NAS/Test-File-SMB2-MOUNT-NAS.txt

The command executes with no feedback (other than the return of the
prompt awaiting the next command) and the file of the name specified
with the contents specified appears at the root of the NAS file share.

To the SMB share on a Windows 7 Professional workstation set to SMB2:

sudo mount -t cifs //hp_laptop/videos /media/Win7 -o
username=sawozny,vers=2.0,uid=1000

echo "This is a test." > /media/Win7/Test-File-SMB2-MOUNT-Win7.txt

The command also executes with no feedback (other than the return of the
prompt awaiting the next command) and the file of the name specified
with the contents specified appears at the root of the “videos” file
share on the Win7 workstation.

Note: I know I shouldn’t need to specify SMB dialect when doing the
mount because the server is set to only make an SMB2 connection, but
mounting doesn’t work for me if I don’t specify the version number.
Maybe I’m doing something wrong here, but since the point was to make an
SMB2 connection and I did I’m not burning a lot of cycles on why I
needed to use an additional parameter that shuoldn’t have been needed.

Digging into the PCAPs (attached), it’s clear that what’s going on under
the hood is very different.  The PCAPs both appear identical, so there
do not appear to be any server side differences affecting the testing.
The PCAP starts right out of the gate with a disposition 5 (overwrite if
the file already exists; otherwise, create it) request of the correct
name in packet 1 and a handle returned in packet 2, then a file internal
info request in packet 4 and a response in packet 5, followed by a set
EOF info in packet 6 with a success response in packet 7.  Next a
disposition 1 (open if file exists, else fail) request in packet 8
followed by a success return in packet 9 followed by a set info setting
the last write and last change dates in packet 10, followed by a success
return in packet 11 and a close of the handle opened in packet 9 in
packet 12 with a success return in packet 13, then the actual data write
in packet 14 followed by a success return in packet 15, a close handle
(from packet 2) in packet 16 and a successful return of that request in
packet 17.

And the file is where it’s supposed to be with the correct contents on
an SMB2 share.

Unfortunately, this doesn’t really help pin down the nature of the
issue.  If the SMB2 conversations were the same to a certain point and
diverged it would be easier to point to a spot and say “look here” but
like everything else in computing there’s more than one way to do almost
any function.  If we were to focus on the actual moment of file
creation, then in the GVFS SMB2 PCAPs there is no OpLock (0) vs batch
OpLock (9) in the MOUNT PCAP, also the Access Mask in the GVFS PCAP is
0x00120116 vs 0x40000080 in the MOUNT PCAP, the Share Access is
0x0000003 GVFS vs 0x0000007 MOUNT, the disposition of the create is
different as described and there is an
SMB2_CREATE_DURABLE_HANDLE_REQUEST in the MOUNT PCAP that is not in the
GVFS PCAP, but I don’t know if anything that happened before or after
the moment of file creation in both the failed and successful PCAPs
affected that moment, making those differences make sense.  So all the
PCAPs are attached for your review to see what else might be making the
difference.

Finally, per the instructions at
https://wiki.gnome.org/Projects/gvfs/debugging I ran 2 debugs of this
issue.  One was with the base instructions for a debug and the other was
with GVFS_SMB_DEBUG set to 10.  These were not done at the same time as
the PCAPs to avoid Heisenberg differences, but the mount steps and echo
command were the same, so if you follow the latter part of the debug
(the packet captures only have the actual transfer where the debug log
has to have everything from the launch of Nautilus) it should be pretty
easy to match up what’s happening.  Unfortunately (AGAIN) not much
additional info seems to be offered.  The failed requests for the file
name show up (2 “No such directory or file” returns in the log, rather
than the 4 STATUS_OBJECT_NAME_NOT_FOUND packets in the PCAP; not sure
why the discrepancy) but the write returns no error code in the log,
despite the fact that in the PCAP it doesn’t appear to even have gotten
to the stage of writing.  Perhaps the repeated STATUS_ACCESS_DENIED in
the PCAP were part of that stage in the process, but a failure at that
point wasn’t something GVFS-SMB knew how to handle so it didn’t return
an error code on that function call (but also didn’t do the actual
writing).  The only other thing I can find in that file of interest is a
GvfsJobCreateMonitor call that returned “Operation not supported by
backend” but in context it seems to be for the creation of a directory
monitor which doesn’t seem to be that life threatening a failure.
Anyway, I’m just guessing at this point so I included the logs in case
there’s something important in there I missed.

If you’ve made it this far, I hope this has been useful to tracking down
this bug.  If there is anything I can do to help in terms of further
information gathering or testing, please let me know.

** Affects: ubuntu
     Importance: Undecided
         Status: New

** Affects: gvfs (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "6 PCAPs and 2 GVFS Debug Logs"
   
https://bugs.launchpad.net/bugs/1815395/+attachment/5237570/+files/GVFS-SMB-BUG-REPORT.zip

** Also affects: gvfs (Ubuntu)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to gvfs in Ubuntu.
https://bugs.launchpad.net/bugs/1815395

Title:
  GVFS-SMB write problems to SMB2 shares

Status in Ubuntu:
  New
Status in gvfs package in Ubuntu:
  New

Bug description:
  Summary:

  I believe there’s a bug in GVFS-SMB (or its implementation in
  Nautilus) that manifests when writing to SMB2 shares.  It occurs
  inconsistently (but enough to be annoying) in everyday use, but it can
  be replicated consistently with a simple echo command redirected to a
  file on the share.  This doesn’t happen when using the ‘mount’ command
  to connect to SMB2 shares.  It also doesn’t happen when making an SMB1
  connection through Nautilus OR mount.  I’ve also tested this against 2
  different SMB servers (a Synology DS216J and a share on a Win7 Pro
  system) and the results are consistent so I don’t believe it’s a
  server side issue.

  I don’t know enough about this part of the system to identify where
  the issue is or fix it, but I’ve included data below to show the basis
  for my assertions above.  The workaround is either going back to SMB1
  or mounting my shares with the mount command, but I was hoping someone
  could take this on and fix it long-term.  As Microsoft progressively
  phases out SMB1, I think this will eventually become a wider scale
  problem so now would be an opportune time to address it.  If there’s
  any further data I can provide, please let me know.

  Also, I’m sorry this bug report is so long, but I’m erring on the side
  of too much detail rather than too little.

  Detail:

  On my Ubuntu 18.04 system, when I attempt to echo a string into a file
  on a GVFS mounted share (left hand nav of Nautilus + Other Locations)
  and that share is configured to SMB1 (client side left to negotiate,
  server side locked to SMB1) this works as expected.  I tested against
  SMB shares on both a Synology Diskstation DS216J NAS and a Windows 7
  Professional workstation. When I was on Ubuntu 16.04 that was SMB1
  only out of the box I didn’t know this problem even existed.  Now that
  I’m on Ubuntu 18.04 and my share was willing to make connection at
  SMB2 I started to have weird, inconsistent disk problems.  Much
  troubleshooting and analysis later, I can recreate this issue
  consistently with a simple echo command to a file on the share.

  To the SMB share on a Synology Diskstation NAS set to SMB1:

  echo "This is a test." > /run/user/1000/gvfs/smb-
  share\:server\=diskstation\,share\=fileshare/Test-File-SMB1-GVFS-
  NAS.txt

  The command executes with no feedback (other than the return of the
  prompt awaiting the next command) and the file of the name specified
  with the contents specified appears at the root of the NAS file share.

  To the SMB share on a Windows 7 Professional workstation set to SMB1:

  echo "This is a test." > /run/user/1000/gvfs/smb-
  share\:server\=hp_laptop\,share\=videos/Test-File-SMB1-GVFS-Win7.txt

  The command also executes with no feedback (other than the return of
  the prompt awaiting the next command) and the file of the name
  specified with the contents specified appears at the root of the
  “videos” file share on the Win7 workstation.

  I also took PCAPs of this occurring (attached).  What they show is
  about what I would expect for this operation.  Client SMB pings the
  server in packet 1 who responds in packet 2, queries info on the
  folder in packet 4 and gets some basic info back in packet 5, queries
  the filename (to make sure it doesn’t already exist) several times
  (not sure why), each time getting a name not found reply in packets 6
  through 17, then creates the file in packet 18, gets back a success
  message in packet 19, queries info on the new file in packet 20, gets
  a success response in packet 21, writes the contents in packet 22,
  gets a good response in packet 23, queries the file one more time
  (presumably to make sure the new size is good) in packet 24, gets a
  response with the info in packet 25, requests the handle be closed in
  packet 26, and gets an OK to that in packet 27.

  This is all wonderful.  Perhaps slightly inefficient, but I don’t know
  what’s going on under the hood, so who am I to judge?  When I switch
  to SMB2 on the server side and reconnect via GVFS using Nautilus,
  things fall apart.

  To the SMB share on a Synology Diskstation NAS set to SMB2:

  echo "This is a test." > /run/user/1000/gvfs/smb-
  share\:server\=diskstation\,share\=fileshare/Test-File-SMB2-GVFS-
  NAS.txt

  returns

  bash: /run/user/1000/gvfs/smb-share:server=diskstation,share=fileshare
  /Test-File-SMB2-GVFS-NAS.txt: Invalid argument

  The file is where I expected it, but it is 0 bytes.

  To the SMB share on a Windows 7 Professional workstation set to SMB2:

  echo "This is a test." > /run/user/1000/gvfs/smb-
  share\:server\=hp_laptop\,share\=videos/Test-File-SMB2-GVFS-Win7.txt

  returns

  bash: /run/user/1000/gvfs/smb-share:server=hp_laptop,share=videos
  /Test-File-SMB2-GVFS-Win7.txt: Invalid argument

  Again, the file is there in the right location, but it is 0 bytes.

  Going to take a look at the PCAPs, it’s pretty clear what’s happened,
  but less clear why.  The PCAPs were consistent between the NAS and the
  Win7 Pro workstation.  So the client requests a file with no name
  (presumably as shortcut for pwd) be opened in packet 1 and a handle
  was returned in packet 2.  Info about this handle was requested in
  packet 3 and returned in packet 4 (filename is \) followed by a
  request to close that handle in packet 5 and a success response in
  packet 6.  I assume up until now, it was just overhead to make sure
  the share worked properly.  Then 4 repeated requests of disposition 1
  (open if exists, else fail) for the filename I echoed to with the
  related (and expected) name not found errors returned in packets 7
  through 14.  Then finally a disposition 3 request (create; fail if
  file already exists) for the filename specified in packet 15 followed
  by a success message in packet 16.  This is the point where things
  start to go wrong.  We then have 4 repeated file_all_info requests
  each followed by an access denied in packets 17 through 24.  Of
  course, one looks at this as says WTF?  You JUST created the file (and
  the 0 byte file does, indeed, appear in the location requested) so why
  would a request for file info return an access denied?  I checked the
  file handles and they appear consistent.  Unfortunately, I don’t know
  enough about the inner workings of SMB to see if there was anything
  incorrect about the file create in packet 15 or the response in packet
  16, but that seems like a good place to start looking.  After the
  access denied packets, the client must get the hint and requests a
  close of the file handle in packet 25 and gets an OK from the server
  to that in packet 26.

  So, I stumbled around with this problem for a while comparing SMB1
  with SMB2 (e.g. apples and oranges or, I guess more accurately,
  oranges and grapefruits) until someone in the forums suggested using
  the mount command instead of GVFS / Nautilus.  I did that and things
  worked just fine.

  To the SMB share on a Synology Diskstation NAS set to SMB2:

  sudo mount -t cifs //diskstation/fileshare /media/NAS -o
  username=sawozny,vers=2.0,uid=1000

  echo "This is a test." > /media/NAS/Test-File-SMB2-MOUNT-NAS.txt

  The command executes with no feedback (other than the return of the
  prompt awaiting the next command) and the file of the name specified
  with the contents specified appears at the root of the NAS file share.

  To the SMB share on a Windows 7 Professional workstation set to SMB2:

  sudo mount -t cifs //hp_laptop/videos /media/Win7 -o
  username=sawozny,vers=2.0,uid=1000

  echo "This is a test." > /media/Win7/Test-File-SMB2-MOUNT-Win7.txt

  The command also executes with no feedback (other than the return of
  the prompt awaiting the next command) and the file of the name
  specified with the contents specified appears at the root of the
  “videos” file share on the Win7 workstation.

  Note: I know I shouldn’t need to specify SMB dialect when doing the
  mount because the server is set to only make an SMB2 connection, but
  mounting doesn’t work for me if I don’t specify the version number.
  Maybe I’m doing something wrong here, but since the point was to make
  an SMB2 connection and I did I’m not burning a lot of cycles on why I
  needed to use an additional parameter that shuoldn’t have been needed.

  Digging into the PCAPs (attached), it’s clear that what’s going on
  under the hood is very different.  The PCAPs both appear identical, so
  there do not appear to be any server side differences affecting the
  testing.  The PCAP starts right out of the gate with a disposition 5
  (overwrite if the file already exists; otherwise, create it) request
  of the correct name in packet 1 and a handle returned in packet 2,
  then a file internal info request in packet 4 and a response in packet
  5, followed by a set EOF info in packet 6 with a success response in
  packet 7.  Next a disposition 1 (open if file exists, else fail)
  request in packet 8 followed by a success return in packet 9 followed
  by a set info setting the last write and last change dates in packet
  10, followed by a success return in packet 11 and a close of the
  handle opened in packet 9 in packet 12 with a success return in packet
  13, then the actual data write in packet 14 followed by a success
  return in packet 15, a close handle (from packet 2) in packet 16 and a
  successful return of that request in packet 17.

  And the file is where it’s supposed to be with the correct contents on
  an SMB2 share.

  Unfortunately, this doesn’t really help pin down the nature of the
  issue.  If the SMB2 conversations were the same to a certain point and
  diverged it would be easier to point to a spot and say “look here” but
  like everything else in computing there’s more than one way to do
  almost any function.  If we were to focus on the actual moment of file
  creation, then in the GVFS SMB2 PCAPs there is no OpLock (0) vs batch
  OpLock (9) in the MOUNT PCAP, also the Access Mask in the GVFS PCAP is
  0x00120116 vs 0x40000080 in the MOUNT PCAP, the Share Access is
  0x0000003 GVFS vs 0x0000007 MOUNT, the disposition of the create is
  different as described and there is an
  SMB2_CREATE_DURABLE_HANDLE_REQUEST in the MOUNT PCAP that is not in
  the GVFS PCAP, but I don’t know if anything that happened before or
  after the moment of file creation in both the failed and successful
  PCAPs affected that moment, making those differences make sense.  So
  all the PCAPs are attached for your review to see what else might be
  making the difference.

  Finally, per the instructions at
  https://wiki.gnome.org/Projects/gvfs/debugging I ran 2 debugs of this
  issue.  One was with the base instructions for a debug and the other
  was with GVFS_SMB_DEBUG set to 10.  These were not done at the same
  time as the PCAPs to avoid Heisenberg differences, but the mount steps
  and echo command were the same, so if you follow the latter part of
  the debug (the packet captures only have the actual transfer where the
  debug log has to have everything from the launch of Nautilus) it
  should be pretty easy to match up what’s happening.  Unfortunately
  (AGAIN) not much additional info seems to be offered.  The failed
  requests for the file name show up (2 “No such directory or file”
  returns in the log, rather than the 4 STATUS_OBJECT_NAME_NOT_FOUND
  packets in the PCAP; not sure why the discrepancy) but the write
  returns no error code in the log, despite the fact that in the PCAP it
  doesn’t appear to even have gotten to the stage of writing.  Perhaps
  the repeated STATUS_ACCESS_DENIED in the PCAP were part of that stage
  in the process, but a failure at that point wasn’t something GVFS-SMB
  knew how to handle so it didn’t return an error code on that function
  call (but also didn’t do the actual writing).  The only other thing I
  can find in that file of interest is a GvfsJobCreateMonitor call that
  returned “Operation not supported by backend” but in context it seems
  to be for the creation of a directory monitor which doesn’t seem to be
  that life threatening a failure.  Anyway, I’m just guessing at this
  point so I included the logs in case there’s something important in
  there I missed.

  If you’ve made it this far, I hope this has been useful to tracking
  down this bug.  If there is anything I can do to help in terms of
  further information gathering or testing, please let me know.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/1815395/+subscriptions

-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to     : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to