bug#36402: installation error

2020-03-18 Thread Mathieu Othacehe


Hello,

This has hopefully been resolved by Guile-Parted 0.0.2 update, so
closing!

Thanks,

Mathieu





bug#36402: installation error

2019-09-08 Thread Ludovic Courtès
Hello,

Mathieu Othacehe  skribis:

>> So perhaps you should define your own ‘define-wrapped-type’ macro that
>> does ‘define-record-type’ + the weak hash table thing, and replace all
>> ‘define-record-type’ instances in structs.scm with
>> ‘define-wrapped-type’.  How does that sound?
>
> Seems like the right thing to do :) However, I had a look to all Parted
> functions which result is passed to a pointer->X! function, and except
> ped_device_get, they always return newly allocated objects. So I guess
> we are safe for now.

OK, sounds good!

Ludo’.





bug#36402: installation error

2019-09-05 Thread Mathieu Othacehe


Hey,

> So perhaps you should define your own ‘define-wrapped-type’ macro that
> does ‘define-record-type’ + the weak hash table thing, and replace all
> ‘define-record-type’ instances in structs.scm with
> ‘define-wrapped-type’.  How does that sound?

Seems like the right thing to do :) However, I had a look to all Parted
functions which result is passed to a pointer->X! function, and except
ped_device_get, they always return newly allocated objects. So I guess
we are safe for now.

Thanks,

Mathieu





bug#36402: installation error

2019-09-05 Thread Ludovic Courtès
Hello!

Mathieu Othacehe  skribis:

> And here is what was going wrong:
>
> ped_device_get and ped_device_get_next can return pointers to already
> existing device object. So set-pointer-finalizer! was possibly called
> multiple times on the same device pointer, resulting in calling
> ped_device_destroy multiple times on the same device pointer.
>
> To prevent that, I created a weak value hash table to make sure that one
>  object maps to exactly one device pointer, and that the pointer
> finalizer is set only once. See commit b35839b.

Good catch!

I confirm that:

  guix build guile-parted --with-branch=guile-parted=master --check

passed several times in a row.  :-)

b35839b LGTM!

(‘define-wrapped-pointer-type’ takes care of this, but we can’t use it
while we use bytestructures (info "(guile) Void Pointers and Byte
Access").)

It seems to me that the fix should be not just for ‘pointer->device!’
but for all the ‘pointer->RECORD!’ procedures, where we potentially have
similar scenarios, and where we’d rather have:

  (eq? (pointer->X ptr) (pointer->X ptr))

So perhaps you should define your own ‘define-wrapped-type’ macro that
does ‘define-record-type’ + the weak hash table thing, and replace all
‘define-record-type’ instances in structs.scm with
‘define-wrapped-type’.  How does that sound?

Thank you!

Ludo’.





bug#36402: installation error

2019-09-03 Thread Ludovic Courtès
Hello,

Mathieu Othacehe  skribis:

>> It might be useful to add calls to ‘gc’ here and there in the tests to
>> stress-test memory management.
>
> Inserting gc calls here:
>
> (test-assert "partition-remove extended"
>   (with-tmp-device
>"device-extended.iso"
>(lambda (new-device)
>  (let* ((device (get-device new-device))
> (disk (disk-new device))
> (partitions (disk-partitions disk))
> (extended-partition (find extended-partition? partitions)))
>(gc) ; <-- Try to destroy disk?
>(disk-remove-partition* disk extended-partition)
>(gc)
>(equal? (extended-partition-count disk) 0)
>
> causes a segfault. Is it legal to call GC here? Do you have any clue on
> how to investigate what the GC is doing?

GC might run at any time, so yes, it’s valid to insert calls to ‘gc’
anywhere.  So this is good, this is kind of issue we want to catch.  :-)

To investigate, I would recommend re-reading how memory management works
in Parted.  Questions such as:

  1. Can Parted free a C object (disk, partition, etc.) behind your
 back?  Is there a way to prevent it?

  2. When a Parted object aggregates another object, how’s memory
 managed?  For example, if a “disk” aggregates (refers to) a
 “partition”, who’s responsible for freeing that partition?

  3. Relatedly, if, say, a “disk” aggregates a “partition”, do you make
 sure on the Scheme side that you do not free the partition while
 the disk is still alive?

 You can make sure this doesn’t happen by using a weak-key hash
 table, as discussed before, where the key is the disk and the value
 is the list of partitions it aggregates.

If you can get a backtrace from the core dump, that might give clues.

Setting the environment variable:

  export GLIBC_TUNABLES=glibc.malloc.check=1

might tell you if it’s a double-free error or something.

You can also use Valgrind though libgc creates a lot of noise there.

Please share whatever you gather before you get depressed.  ;-)

HTH!

Ludo’.





bug#36402: installation error

2019-09-02 Thread Mathieu Othacehe


Hey,

I pushed the missing file :).

> It might be useful to add calls to ‘gc’ here and there in the tests to
> stress-test memory management.

Inserting gc calls here:

--8<---cut here---start->8---
(test-assert "partition-remove extended"
  (with-tmp-device
   "device-extended.iso"
   (lambda (new-device)
 (let* ((device (get-device new-device))
(disk (disk-new device))
(partitions (disk-partitions disk))
(extended-partition (find extended-partition? partitions)))
   (gc) ; <-- Try to destroy disk?
   (disk-remove-partition* disk extended-partition)
   (gc)
   (equal? (extended-partition-count disk) 0)
--8<---cut here---end--->8---

causes a segfault. Is it legal to call GC here? Do you have any clue on
how to investigate what the GC is doing?

Thanks,

Mathieu





bug#36402: installation error

2019-09-01 Thread Ludovic Courtès
Howdy,

Mathieu Othacehe  skribis:

> I followed your advice and hid all destroy related functions behind
> pointer finalizers. I also added some unit tests to Guile-Parted.

Nice!

I tried “guix build guile-parted --with-branch=guile-parted=master”, but
that fails because ‘build-aux/test-driver.scm’ is missing from the repo,
I think.

It might be useful to add calls to ‘gc’ here and there in the tests to
stress-test memory management.

> I pushed everything but feel free to comment! Then, I'll make a new
> release and try to adapt the installer to those changes.

Awesome, thanks a lot!

Ludo’.





bug#36402: installation error

2019-08-31 Thread Mathieu Othacehe


Hey Ludo,

> Does that make sense?
>
> I think we should audit and adjust Guile-Parted in that spirit.  WDYT?

Sorry for the delay!

I followed your advice and hid all destroy related functions behind
pointer finalizers. I also added some unit tests to Guile-Parted.

I pushed everything but feel free to comment! Then, I'll make a new
release and try to adapt the installer to those changes.

Thanks,

Mathieu





bug#36402: installation error

2019-07-02 Thread Ludovic Courtès
Hi,

Mathieu Othacehe  skribis:

>> Does that make sense?
>>
>> I think we should audit and adjust Guile-Parted in that spirit.  WDYT?
>
> Yes, it seems like the right thing to do. I'll try to apply those
> changes to Guile-Parted next week. However, as we cannot reproduce those
> null-pointer issues, we won't be sure if we fixed them for sure.

Perhaps we can reproduce them by adding a bunch of calls to ‘gc’ in the
code here and there.  That’s often a good way to stress-test memory
management.

> In the meantime, and completely unrelated, I created a new
> wip-cross-system branch to fix most of the cross-compilation issues
> preventing from cross-building a Guix System.

Neat!  Consider opening a new issue for this.  :-)

Thank you,
Ludo’.





bug#36402: installation error

2019-06-29 Thread Mathieu Othacehe


Hey Ludo,

> Does that make sense?
>
> I think we should audit and adjust Guile-Parted in that spirit.  WDYT?

Yes, it seems like the right thing to do. I'll try to apply those
changes to Guile-Parted next week. However, as we cannot reproduce those
null-pointer issues, we won't be sure if we fixed them for sure.

In the meantime, and completely unrelated, I created a new
wip-cross-system branch to fix most of the cross-compilation issues
preventing from cross-building a Guix System.

Thanks,

Mathieu





bug#36402: installation error

2019-06-27 Thread Ludovic Courtès
Hi Juan,

Juan  skribis:

> I ran into some trouble while attempting to install Guix SD (1.0.1.x86_64). 
> It happens when I try to do the guided graphical installation, I'll 
> transcript the whole text here:

[...]

> 755:33 14 (run-partitioning-page)
> In ./gnu/installer/parted.scm:
>1010:14 13 (auto-partition! #< bytestructure: # 0x106d840>> #:scheme _)
> 870:21 12 (loop _ _ _)
> 863:17 11 (loop _ 2617712816 1289318400)
> 771:25 10 (mkpart #< bytestructure: #> _ 
> #:previous-partition _)
> In parted/structs.scm:
> 552:19 9 (pointer->partition _)
>  132:3 8 (pointer->bytestructure # 
> #)
> In unknown file:
> 7 (pointer->bytevector # 88 # #)
> In ice-9/boot.scm:
> 751:25 6 (dispatch-exception 5 null-pointer-error ("pointer->bytevector" 
> "null pointer dereference" () ()))

That looks like what was reported at
, so I’ve merged both.  Thanks
for the report, Juan!

Mathieu, in the same spirit as
, I think we have an object
life cycle and memory management issue.

I hadn’t noticed but we’re doing manual memory management by calling
things like ‘disk-destroy’ in the installer.  That’s crash-prone and
best avoided.

The usual way to handle it in bindings is by:

  1. Adding pointer finalizers.  So for example the pointer object
 associated with a  record would have a finalizer that calls
 ‘ped_disk_destroy’.

  2. Having a weak-key hash table to track object dependencies when
 needed.  So, if a  aggregates a , there must be an
 entry in the hash table that maps the  to the .  That
 way, we ensure that the  object remains live as long as the
  is live.

We can expose “close” functions that free OS resources such as file
descriptors, but we should not expose deallocation functions like
‘ped_disk_destroy’; instead, we let the GC call them when the objects
become unreachable.

Does that make sense?

I think we should audit and adjust Guile-Parted in that spirit.  WDYT?

Thanks,
Ludo’.





bug#36402: installation error

2019-06-27 Thread Juan
Hi!

I ran into some trouble while attempting to install Guix SD (1.0.1.x86_64). It 
happens when I try to do the guided graphical installation, I'll transcript the 
whole text here:

"
The installer has encountered an unexpected problem. The backtrace is displayed 
below. Please report it by email to .

In ice-9/boot-9.scm:
829:9 19 (catch srfi-34 # # _)
829:9 18 (catch srfi-34 # # _)
829:9 17 (catch srfi-34 # # _)
829:9 16 (catch srfi-34 # # _)
In ./gnu/installer/steps.scm:
182:21 15(_)
In ./gnu/installer/newt/partition.scm:
755:33 14 (run-partitioning-page)
In ./gnu/installer/parted.scm:
   1010:14 13 (auto-partition! #< bytestructure: #> #:scheme _)
870:21 12 (loop _ _ _)
863:17 11 (loop _ 2617712816 1289318400)
771:25 10 (mkpart #< bytestructure: #> _ 
#:previous-partition _)
In parted/structs.scm:
552:19 9 (pointer->partition _)
 132:3 8 (pointer->bytestructure # #)
In unknown file:
7 (pointer->bytevector # 88 # #)
In ice-9/boot.scm:
751:25 6 (dispatch-exception 5 null-pointer-error ("pointer->bytevector" 
"null pointer dereference" () ()))
In ice-9/eval.scm:
619:8 5 (_ #(#(# #< name: newt 
init: # exit: #procedure exit ()> exit-error:
# final-p...>) ...))

619:8 4 (_ #(#(#(# #< name: newt 
init: # exit: #procedure exit ()> exit-error:
# fi...>) ...) #))
In ice-9/ports.scm:
462:17 3 (call-with-output-file _ _ #:binary _ #:encoding _)
In ice-9/eval.scm:
619:8 2 (_ #(#(# null-pointer-error 
("pointer->bytevector" "null pointer dereference" () ())) #))
159:9 1 (_ #(#(# null-pointer-error 
("pointer->bytevector" "null pointer dereference" () ())) #))
In unknown file:
0 (make-stack #t)
ice-9/eval.scm:159:9: In procedure pointer->bytevector: null pointer dereference
"

Here are the specifics of my computer:
ASUS MAXIMUS VI IMPACT ACPI BIOS Revision 1301
CPU: Intel Core 15-4440 CPU @ 3.19GHz

I'm not sure if it's a hardware compatibility problem, a bug in the guided 
graphical installation, or something else.

Thanks in advance, kind regards.