@Araq - A B-Tree in the stdlib would be cool! Thanks in advance!
Memory hierarchies since the 1990s have broadened the relevance of 1970s
practical wisdom for disk storage. A solid hash table with good locality and a
solid B-Tree often win today. A great stdlib should have both.
A more obscure bonus of a B-Tree (when compared to, say, balanced binary trees)
is chea
@cblake I entirely agree, B-trees are pretty awesome. When learning about Rust
iterators I discovered that it's often faster to turn a container into a B-tree
and then collect it into the starting container type than to operate directly
on the initial container.
@Araq - fabulous. I think a B-Tree in the stdlib would be great.
@mratsim - I think you meant `collections.intsets` with a 't' which does _not_
yield items() in key order.
The built-in `set[up to int16]` type _does_ (implicitly) order its keys,
though. @cdome also did not mention how large the
`insets` dedups for sure and seems extremely fast. I don't know if its items
are sorted, though.
I have a B-Tree implementation. Remind me to release it.
`CountTable` is just a histogram and the sorting is by the bin counts not by
the keys. Keys are visited in "hash order". `hashes` does use an identity hash
for integers which could create some confusion from a simple test program if
the modulo the table size post hash transform doesn't change th
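
To illustrate the point about `CountTable` visiting keys in hash order rather than key order, here is a minimal sketch. The sample data and the names `counts` and `keysInOrder` are made up for illustration; the idea is simply to sort the keys explicitly instead of relying on iteration order.

```nim
import std/[algorithm, sequtils, tables]

# Hypothetical sample data with duplicates.
let data = [5, 1, 3, 1, 9, 3, 3]

# value -> number of occurrences
let counts = data.toCountTable()

# The order in which `keys` yields is an implementation detail of the hash
# table, so collect the keys and sort them explicitly.
let keysInOrder = toSeq(counts.keys).sorted()

for k in keysInOrder:
  echo k, ": ", counts[k]
```

This costs an extra O(n log n) sort at read time, which may or may not be acceptable depending on how often the ordered view is needed.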
`CountTable` from the `tables` library can also do this. I don't know how it
compares speed-wise with the previous solution
The `keys()` iterator will retrieve the sorted numbers (it sorts on insert??),
and the value is the number of occurrences.
If you already have a sequence of data, you can
`collections.heapqueue` does mostly what you want. You would have to remove
duplicates yourself, like below, but probably with an iterator called something
like `unique`.
import collections.heapqueue
var heap = newHeapQueue[int]()
let data = [1, 3, 5, 1, 7, 9, 2, 4, 6,
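
A self-contained sketch of the heap-based approach described above, not the poster's original code: it uses `initHeapQueue` from `std/heapqueue`, and the helper `drainUnique` and the sample values are hypothetical. Duplicates are skipped while popping, since equal values come off the heap consecutively.

```nim
import std/heapqueue

proc drainUnique(heap: var HeapQueue[int]): seq[int] =
  ## Pops everything off `heap`, keeping each distinct value once.
  ## Pops arrive in increasing order, so a duplicate always equals
  ## the most recently kept value.
  while heap.len > 0:
    let x = heap.pop()
    if result.len == 0 or result[^1] != x:
      result.add x

var heap = initHeapQueue[int]()

# Values may be pushed at different times; a literal array stands in here.
for x in [1, 3, 5, 1, 7, 9, 2, 4, 6, 3]:
  heap.push(x)

let ordered = drainUnique(heap)
echo ordered
```

Note that draining empties the heap, so this suits a "collect everything, then read once" pattern better than repeated ordered reads.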
I have a bunch of ints that I need to insert into a set and read in increasing
order, avoiding duplicates. They are produced at different times, so I don't
have them in a seq.
I can probably knock up a quick SortedSeq as a distinct seq in my code. I have
a good capacity estimate for my use case, so
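
One way the sorted-seq idea might look, as a rough sketch under assumptions: it uses a plain `seq[int]` rather than a `distinct` type to keep it short, the proc name `inclSorted` is hypothetical, and the capacity of 16 is a placeholder for the real estimate. `lowerBound` from `std/algorithm` finds the insertion point by binary search.

```nim
import std/algorithm

proc inclSorted(s: var seq[int], x: int): bool =
  ## Inserts `x` so that `s` stays sorted.
  ## Returns false (and changes nothing) if `x` is already present.
  let i = s.lowerBound(x)          # first index whose element is >= x
  if i < s.len and s[i] == x:
    return false
  s.insert(x, i)
  true

# Preallocate using the known capacity estimate (16 is a placeholder).
var nums = newSeqOfCap[int](16)

for x in [7, 2, 9, 2, 4, 7]:
  discard nums.inclSorted(x)

echo nums
```

Lookup is O(log n), but each insert shifts elements and is O(n) in the worst case, so this fits best when the set is small or inserts mostly arrive near the end.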