Re: Strange Bug in LDC vs DMD

2017-06-30 Thread FoxyBrown via Digitalmars-d-learn

On Friday, 30 June 2017 at 20:13:37 UTC, H. S. Teoh wrote:
On Fri, Jun 30, 2017 at 07:57:22PM +, FoxyBrown via 
Digitalmars-d-learn wrote: [...]

[...]


Um... the docs explicitly say that dirEntries is lazy, did you 
not see that?


[...]


It is possible that dmd has the same problem but I did not see 
it. I developed the program using dmd, then once it was working 
I moved to ldc for the release build and saw immediately that the 
results differed and were wrong.
Since I was debugging it the whole time and it was working fine 
with dmd, I simply assumed dmd was working.


Since it is a lazy range, I'm sure that is the problem.



Re: Strange Bug in LDC vs DMD

2017-06-30 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 30, 2017 at 07:57:22PM +, FoxyBrown via Digitalmars-d-learn 
wrote:
[...]
> The only way this can happen is if the rename command is somehow
> feeding back into the algorithm. Since the problem goes away when I
> pre-compute dirEntries, it suggests that dirEntries is being lazily
> computed.

Um... the docs explicitly say that dirEntries is lazy, did you not see
that?

https://dlang.org/phobos/std_file#dirEntries

"Returns an input range of DirEntry that *lazily* iterates a given
directory, ..." [emphasis mine]


[...]
> I'm pretty sure that the analysis above is correct, that is,
> dirEntries is lazy and ends up picking up the renamed file. This is
> sort of like removing an element in an array while iterating over the
> array.

This is certainly what it looks like; however, it doesn't explain a
couple of things:

1) Why the DMD version appears to be unaffected, since as far as I can
tell from the code, it is also a lazy iteration;

2) On Linux at least, renaming a file does not move the location of the
entry in the directory, so whether dirEntries is lazy or not shouldn't
even matter in the first place.

Does Windows reorder the directory when you rename files? E.g., if you
set the folder to sort alphabetically, does it actually sort the
directory, or does it only sort the GUI output?  My guess is that the
sort order only affects the GUI output, as it would be grossly
inefficient to actually sort the directory. But you never know with
Windows...  In any case, if Windows *does* physically sort the folder,
that could explain how rename() affects dirEntries.  However, this still
doesn't explain the discrepancy between DMD and LDC.

Actually, now that I think of it... Linux may do the same thing if the
new filename is longer and doesn't fit in the old slot. So that could
explain (2).  However, why the difference between DMD and LDC?  It
doesn't make sense to me, if you tested both on the same OS.

Here's a way to rule out (2): instead of using the current working
directory, change the code to create a fresh copy of the directory each
time.  Does DMD / LDC still show a difference?  The idea here is that it
may have been a coincidence that you saw LDC having the problem and DMD
not, since whether or not a renamed file gets moved depends on how big
the current slot for its name is in the directory, and if you've already
done a bunch of operations on the directory, some slots will be bigger
and some will be smaller, so some renames will happen in-place whereas
others will cause a reordering.  It could be you just got unlucky with
LDC and caused a reordering, whereas you got lucky with DMD and the
existing slots were already big enough so the problem isn't visible.
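
One way to get a fresh copy of the directory for each run is sketched
below. This is only an illustration: the helper name freshCopy, the
"rename-test-" prefix and the run counter are made up for the example,
and it assumes the original files sit untouched in a separate source
directory.

```d
import std.conv : to;
import std.file : copy, dirEntries, mkdirRecurse, SpanMode, tempDir;
import std.path : baseName, buildPath;
import std.process : thisProcessID;
import std.stdio : writeln;

/// Copies every regular file in `srcDir` into a brand-new scratch
/// directory and returns the path of that directory.
/// (Hypothetical helper, not part of Phobos.)
string freshCopy(string srcDir, size_t run)
{
    // Assumed naming scheme; any path that does not exist yet will do.
    auto dst = buildPath(tempDir,
        "rename-test-" ~ thisProcessID.to!string ~ "-" ~ run.to!string);
    mkdirRecurse(dst);

    // Iterating srcDir lazily is harmless here: nothing in srcDir is
    // modified, all writes go to dst.
    foreach (entry; dirEntries(srcDir, SpanMode.shallow))
    {
        if (entry.isFile)
            copy(entry.name, buildPath(dst, baseName(entry.name)));
    }
    return dst;
}

void main(string[] args)
{
    // args[1]: directory holding the original, untouched test files.
    if (args.length < 2)
    {
        writeln("usage: freshcopy <srcDir>");
        return;
    }
    writeln(freshCopy(args[1], 0));
}
```

Running the rename loop on the returned path under both compilers means
each run starts with directory entries laid out by the same sequence of
file creations, rather than by whatever earlier runs left behind.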

OTOH, this still doesn't explain why calling the OS functions directly
fixes the problem.  If there is a bug in LDC's version of dirEntries
somewhere, we'd like to know about it so that we can fix it.


T

-- 
Without outlines, life would be pointless.


Re: Strange Bug in LDC vs DMD

2017-06-30 Thread FoxyBrown via Digitalmars-d-learn

On Friday, 30 June 2017 at 17:32:33 UTC, H. S. Teoh wrote:
On Fri, Jun 30, 2017 at 12:50:24PM +, FoxyBrown via 
Digitalmars-d-learn wrote:

I am using dirEntries to iterate over files to rename them.

I am renaming them in a loop(will change but added code for 
testing).



In DMD the renaming works but in LDC the renaming fails. It 
fails in a way that I can't quite tell and I cannot debug 
because visual D is not working properly for LDC.


The code essentially looks like the following:


auto dFiles = dirEntries(loc, mask, _mode);

foreach (d; dFiles)
{
    auto newName = Recompute(d.name);
    writeln(newName);
    rename(d.name, newName);
}

but when I comment out rename, it works under LDC.

The funny thing is, newName is printed wrong so Recompute is 
affected by the rename.


This shouldn't occur.

[...]

This sounds very strange.  What exactly do you mean by "newName 
is printed wrong"? Do you mean that somehow it's getting 
affected by the *subsequent* rename()?  That would be truly 
strange.  Or do you mean that newName doesn't match what you 
expect Recompute to do given d.name? Perhaps you should also 
print out d.name along with newName just to be sure?


Do you have a reduced code example that's compilable/runnable?  
It's rather hard to tell what's wrong based on your incomplete 
snippet.



T


No, if I simply comment out the rename line, then the writeln 
output changes. Simple as that. No other logic changes in the 
code.


This means that the rename is affecting the output. The recompute 
code gets the filename, does a computation on it, then returns 
it. The loop prints it out, then renames that file to the newly 
computed file name.


The only way this can happen is if the rename command is somehow 
feeding back into the algorithm. Since the problem goes away 
when I pre-compute dirEntries, it suggests that dirEntries is 
being lazily computed. If that is the case, then the problem is 
easily understood: The file gets renamed, dirEntries reiterates 
over the file, then it gets recomputed again, but this time the 
result is bogus because it is a double recompute, which is 
meaningless in this program.


I'm pretty sure that the analysis above is correct, that is, 
dirEntries is lazy and ends up picking up the renamed file. This 
is sort of like removing an element in an array while iterating 
over the array.
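
A minimal sketch of the pre-computed variant mentioned above, assuming 
a shallow listing of a single directory. The names renameAll and 
recompute are placeholders for illustration (recompute stands in for 
Recompute), not the actual code from my program.

```d
import std.array : array;
import std.file : dirEntries, rename, SpanMode;
import std.path : baseName, buildPath, dirName;
import std.stdio : writeln;

/// Renames every file in `dir`, using `recompute` to map an old base
/// name to a new one. (Hypothetical helper for illustration.)
void renameAll(string dir, string delegate(string) recompute)
{
    // .array walks the lazy range to its end *before* any rename
    // happens, so names created by rename() are never visited.
    auto snapshot = dirEntries(dir, SpanMode.shallow).array;

    foreach (entry; snapshot)
    {
        if (!entry.isFile)
            continue;
        auto newName = buildPath(dirName(entry.name),
                                 recompute(baseName(entry.name)));
        // Print both names so a wrong result can be traced to its input.
        writeln(entry.name, " -> ", newName);
        rename(entry.name, newName);
    }
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: renameall <dir>");
        return;
    }
    // Assumed recompute step: append a suffix to the base name.
    renameAll(args[1], name => name ~ ".bak");
}
```

The only difference from the failing loop is that the listing is 
copied into an array first, which is the same idea as copying an array 
before removing elements from it.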


The odd thing is that DMD does not produce the same result. I 
do not know if there is a difference in the LDC vs DMD dirEntries 
code (or lazily evaluated code in general) or if it has to do with 
speed (possibly the renames are cached and do not show up 
immediately to dirEntries with the slower DMD?).


I do not have any simplified code and I'm moving on from here. It 
should be easy to mock something up. The main thing to do is to 
rename the files based on something in the file name.


e.g., suppose you have the files 1,2,3,4,5 (those are their names)

and extract and multiply the filenames by 10. (that is your 
recompute function).


You should end up with 10,20,30,40,50.

But if the cause of the issue I'm describing is in fact true, one 
doesn't necessarily get that, because some files will be iterated 
more than once, e.g., maybe 10, 100, 1000, 20, 200, 30, 40, 50, 
500.


I am doing it over a lot of files btw, but that is essentially 
what is going on.  The example above should be easy to do since 
one can simply to!int the filename and then multiply it by 10 and 
then rename that.
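
The mock-up described above might look roughly like this. It is an 
untested-in-anger sketch, not my real code: timesTen, the scratch 
directory name and the revisit check are all invented for the example. 
The revisit check is an addition so that a file picked up a second 
time is reported instead of being multiplied again.

```d
import std.conv : to;
import std.file : dirEntries, mkdirRecurse, rename, rmdirRecurse,
                  SpanMode, tempDir, write;
import std.path : baseName, buildPath, dirName;
import std.process : thisProcessID;
import std.stdio : writeln;

/// The recompute step from the example: "3" becomes "30".
string timesTen(string name)
{
    return (name.to!int * 10).to!string;
}

void main()
{
    enum lastOriginal = 5; // files are named 1 .. 5 before renaming

    // Fresh scratch directory (assumed name) holding the five files.
    auto dir = buildPath(tempDir,
        "direntries-repro-" ~ thisProcessID.to!string);
    mkdirRecurse(dir);
    scope (exit) rmdirRecurse(dir);
    foreach (i; 1 .. lastOriginal + 1)
        write(buildPath(dir, i.to!string), "");

    // Lazy iteration with a rename inside the loop, as in the
    // original code.
    foreach (entry; dirEntries(dir, SpanMode.shallow))
    {
        auto oldBase = baseName(entry.name);

        // A name above 5 can only be the product of an earlier
        // rename, i.e. the lazy range handed back a renamed file.
        if (oldBase.to!int > lastOriginal)
        {
            writeln("revisited: ", oldBase);
            continue;
        }

        auto newPath = buildPath(dirName(entry.name), timesTen(oldBase));
        writeln(oldBase, " -> ", baseName(newPath));
        rename(entry.name, newPath);
    }
}
```

Building this once with dmd and once with ldc on the same machine 
would show whether any "revisited" lines appear and whether the two 
compilers really differ.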


I have moved on to avoiding dirEntries completely and simply using the 
OS directory listing function manually to extract the data, but 
this should be investigated: if the behavior is what I am 
describing, a serious bug exists somewhere. (If someone could 
confirm that dirEntries is a lazy range, it would explain 
the problem, but not necessarily why dmd and ldc differ, with dmd 
seeming to function as expected.)







Re: Strange Bug in LDC vs DMD

2017-06-30 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Jun 30, 2017 at 12:50:24PM +, FoxyBrown via Digitalmars-d-learn 
wrote:
> I am using dirEntries to iterate over files to rename them.
> 
> I am renaming them in a loop(will change but added code for testing).
> 
> 
> In DMD the renaming works but in LDC the renaming fails. It fails in a
> way that I can't quite tell and I cannot debug because visual D is not
> working properly for LDC.
> 
> The code essentially looks like the following:
> 
> 
>   auto dFiles = dirEntries(loc, mask, _mode);
>
>   foreach (d; dFiles)
>   {
>       auto newName = Recompute(d.name);
>       writeln(newName);
>       rename(d.name, newName);
>   }
> 
> but when I comment out rename, it works under LDC.
> 
> The funny thing is, newName is printed wrong so Recompute is affected
> by the rename.
> 
> This shouldn't occur.
[...]

This sounds very strange.  What exactly do you mean by "newName is
printed wrong"? Do you mean that somehow it's getting affected by the
*subsequent* rename()?  That would be truly strange.  Or do you mean
that newName doesn't match what you expect Recompute to do given d.name?
Perhaps you should also print out d.name along with newName just to be
sure?

Do you have a reduced code example that's compilable/runnable?  It's
rather hard to tell what's wrong based on your incomplete snippet.


T

-- 
An imaginary friend squared is a real enemy.


Re: Strange Bug in LDC vs DMD

2017-06-30 Thread FoxyBrown via Digitalmars-d-learn

On Friday, 30 June 2017 at 15:07:29 UTC, Murzistor wrote:

On Friday, 30 June 2017 at 12:50:24 UTC, FoxyBrown wrote:

The funny thing is, newName is printed wrong so Recompute is 
affected by the rename.

Does LDC use Unicode?
Or, maybe, standard library under LDC does not support Unicode 
- then it is a serious bug.

Do you use any non-ASCII symbols?
Maybe, the Recompute() function returns a non-unicode string 
under LDC?


None of these reasons make sense. They do not take into account 
that simply pre-evaluating dirEntries causes it to work, nor the 
fact that I said that commenting the rename out also works.




Re: Strange Bug in LDC vs DMD

2017-06-30 Thread Murzistor via Digitalmars-d-learn

On Friday, 30 June 2017 at 12:50:24 UTC, FoxyBrown wrote:

The funny thing is, newName is printed wrong so Recompute is 
affected by the rename.

Does LDC use Unicode?
Or, maybe, standard library under LDC does not support Unicode - 
then it is a serious bug.

Do you use any non-ASCII symbols?
Maybe, the Recompute() function returns a non-unicode string 
under LDC?