#10530: De Bruijn Sequence construction for combinat
---------------------------------+------------------------------------------
Reporter: eviatarbach | Owner: sage-combinat
Type: enhancement | Status: needs_work
Priority: major | Milestone: sage-4.6.1
Component: combinatorics | Keywords:
Author: Eviatar Bach | Upstream: N/A
Reviewer: Nicolas M. ThiƩry | Merged:
Work_issues: |
---------------------------------+------------------------------------------
Comment(by nthiery):
Replying to [comment:4 eviatarbach]:
> Thank you for the comments. I have a few questions:
>
> ''Rather than defining an attribute self.k_is_alphabet, I would setup
two attributes self.k and self.alphabet once for all in the init. Then no
more branching needs to be done later on.''
>
> I know it looks ugly. The point of having k_is_alphabet is to check
whether the input ''k'' is an alphabet or an arity, as the processing is
different in either case.
It feels only marginaly different. For example, if alphabet is set to
[0,...,k-1], then the last ``if`` becomes unnecessary; granted, there
is an overhead of an indexed lookup in an array, but I assume it
should be negligible (to be checked with timeit though!).
> I was thinking of removing the alphabet functionality; maybe just the
arity would suffice? One could always map values onto the digits after.
As you feel like; that functionality can indeed be added later.
> ''The length of the sequence is known in advance, right? Then it should
be faster to allocate a bit list right away, and then fill it up, rather
than using append.''
> I'm not clear on this. Are you suggesting I compute the length of the
array in advance, create ''sequence'' full of zeros to that length, and
then use indeces to change the values?
Yes.
> Is append that much slower that creating a long list and then filling it
would be faster?
With append, Python needs to regularly reallocate the list in memory,
requiring a copy of it each time. In principle, Python uses
"exponential" reallocation so the theoretical complexity of append is
"amortized O(1)", but there still is an overhead.
I recommend trying both, and benchmarking with timeit (and post here
the result for our instruction).
> Or are you suggesting that I declare sequence as an int array in C,
> then fill it? This was my initial idea, but I couldn't find a way to
> insert arguments into the declaration. This returns a traceback:
>
> cdef sequence[k^n]
I don't know what's the canonical way to allocate C arrays in
Cython. Probably via malloc or something. It will be in the Cython
doc. Anyway, since the result will be returned as a list, I assume
that copying back the result into a list would ruin the benefit of
having used a C array in the first place.
Ah, by the way: please add tests for all the base cases: k=0,
alphabet=0, ...
Also please add tests for the direct call to ``debruijn_cython``.
Speaking of which I suggest renaming it to debruijn_sequence, and
adding a link back to DeBruijnSequences.
Cheers,
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/10530#comment:5>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sage-trac?hl=en.