Re: testing for deprecation
On Thursday, 1 September 2016 at 11:11:15 UTC, Cauterite wrote:
> How does one test whether a symbol is deprecated? I would have expected something like: __traits(isDeprecated, foo).

Such a trait makes it possible to write code that will break just because something has been marked as deprecated. Doesn't that defeat the purpose of deprecation?
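A minimal sketch of the check being discussed, assuming a __traits(isDeprecated, symbol) trait of the kind requested (hypothetical here; see the enhancement request later in this thread):

deprecated("use bar instead") void foo() {}
void bar() {}

void main()
{
    // Compile-time query of the deprecation attribute on a symbol.
    // Assumes the proposed trait exists; it is not part of the
    // language at the time of this thread.
    static if (__traits(isDeprecated, foo))
        pragma(msg, "foo is deprecated");
}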
xml utf-8 encoding error
Hi, I'm having some trouble getting an XML file using std.net.curl. I'm using get() to receive device info from a Roku television with this code:

char[] inputQuery(string input)
{
    string url = ip ~ "query/" ~ input;
    auto client = HTTP();
    auto content = get(url, client);
    return content;
}

I've had no problem in the past using similar code to receive JSON data from a web server. However, in this case I run into this error:

std.encoding.EncodingException@std/encoding.d(2346): Unrecognized Encoding: "utf-8"
??:? std.encoding.EncodingScheme std.encoding.EncodingScheme.create(immutable(char)[]) [0x9773ff]
/usr/include/dmd/phobos/std/net/curl.d:1196 char[] std.net.curl._decodeContent!(char)._decodeContent(ubyte[], immutable(char)[]) [0x7951cb]
/usr/include/dmd/phobos/std/net/curl.d:1049 char[] std.net.curl._basicHTTP!(char)._basicHTTP(const(char)[], const(void)[], std.net.curl.HTTP) [0x793559]
/usr/include/dmd/phobos/std/net/curl.d:540 char[] std.net.curl.get!(std.net.curl.HTTP, char).get(const(char)[], std.net.curl.HTTP) [0x795b77]
source/backend.d:26 immutable(char)[] backend.inputQuery(immutable(char)[]) [0x7886bd]

When I use cURL directly to get the info I get this:

curl -v --request GET http://192.168.1.140:8060/query/device-info
Connected to 192.168.1.140 (192.168.1.140) port 8060 (#0)
GET /query/device-info HTTP/1.1
Host: 192.168.1.140:8060
User-Agent: curl/7.47.0
Accept: */*
< HTTP/1.1 200 OK
< Server: Roku UPnP/1.0 MiniUPnPd/1.4
< Content-Length: 1826
< Cache-Control: no-cache
< Content-Type: text/xml; charset="utf-8"

This seems to be the relevant code in curl.d:

private auto _decodeContent(T)(ubyte[] content, string encoding)
{
    static if (is(T == ubyte))
    {
        return content;
    }
    else
    {
        import std.format : format;

        // Optimally just return the utf8 encoded content
        if (encoding == "UTF-8" || encoding == "utf-8")
            return cast(char[])(content);

        // The content has to be re-encoded to utf8
        auto scheme = EncodingScheme.create(encoding);
        enforce!CurlException(scheme !is null,
                format("Unknown encoding '%s'", encoding));

I'm not sure what the problem is. It may be the lowercase 'utf-8' in the charset section, but I'm not sure whether the problem is some mistake I made, a bug in DMD, or just lousy XML. Either way, is there any way around this issue?
Re: General performance tip about possibly using the GC or not
On Tuesday, 29 August 2017 at 00:52:11 UTC, Cecil Ward wrote:
> I am vacillating - considering breaking a lifetime's C habits and letting the D garbage collector make life wonderful by just cleaning up after me and ruining my future C discipline by not deleting stuff myself.

The tsv command line tools I open-sourced haven't had any problems with GC. They are only one type of app, perhaps better suited to GC than other apps, but still, it is a reasonable data point. I've done rather extensive benchmarking against similar tools written in native languages, mostly C. The D tools were faster, often by significant margins. The important part is not that they were faster on any particular benchmark, but that they did well against a fair variety of tools written by a fair number of different programmers, including several standard unix tools. The tools were programmed using the standard library where possible, without resorting to low-level optimizations.

I don't know if the exercise says anything about GC vs manual memory management from the perspective of maximum possible code optimization. But I do think it is suggestive of the benefits that may occur in more regular programming, in that GC allows you to spend more time on other aspects of your program and less time on memory management details. That said, all the caveats, suggestions, etc. given by others in this thread apply to my programs too. GC is hardly a free lunch.

Benchmarks on the tsv utilities: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md

Blog post describing some of the techniques used: https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/

--Jon
Re: D Multidimensional arrays weirdness
On Tuesday, 29 August 2017 at 03:16:13 UTC, Johnson Jones wrote:
> T[] arr and is a linear sequential memory array of T's with an unbound length and is effectively the same as T* (although D treats them differently?)?

It is a T* AND a size variable bundled together.

> We can fix the length by adding an upper bound: T[N] arr;

That's actually an entirely different type. This is now a block of static memory instead of a pointer+length combo.

> and this is equivalent to auto arr = cast(T[])malloc(T.sizeof*N)[0..N];

No, it isn't. The static array T[N] has no such allocation; it is all in-place.

> Now, when it comes to multidimensional arrays: T[][] arr;

That's an array of arrays. A pointer+length combo to a bunch of other pointer+length combos.

> or, let's use a fixed array so we can be clear: T[N][M], which means we have M sequential chunks of memory where each chunk is a T[N] array.

That's correct, though with the caveat that there's no padding between the inner arrays; the T[N][M] is a single block of memory with size N * M * T.sizeof.

> Similarly, to access the element at the nth element in the mth chunk, we do t[n][m] because, again, this conforms with how we think of single arrays.

I've never understood how anyone thought that made any sense. I have always thought C was backwards and illogical for that nonsense, and even after many years of using C, I'd still often get it "backwards". D doing the sane thing and having a consistent order is one of the things I like about the language.

> T[N][M] one must access the element as t[m][n]! The accessors are backwards!

What is t[m]? It gets the m'th element of t, right? Well, since t is an array of M items (each of type T[N]), it makes perfect logical sense that t[m] is of type T[N]. Just like how T[N]* is a pointer to a T[N] (which is an array of T), or, if it were a function, t(n) would be its return value. D always follows this simple recursive rule.

Now, what's important about all this is that the built-ins are arrays of arrays rather than multi-dimensional arrays. D also supports multi-dimensional arrays, which are indexed T[x, y]. The actual type of them is not built into the language, though - it needs to be defined as a library type. I'm not sure it ever got into Phobos, but it is available here: https://github.com/libmir/mir-algorithm

Or you can write your own pretty quickly with a static array + the opIndex overload.
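A short example of the ordering rule described above:

// int[3][5]: read right to left -- an array of 5 int[3] chunks.
void main()
{
    int[3][5] t;
    static assert(t.length == 5);    // 5 outer chunks
    static assert(t[0].length == 3); // each chunk is an int[3]

    t[4][2] = 42;                    // last element: chunk 4, item 2
    assert(t[4][2] == 42);

    int[3] chunk = t[4];             // t[4] really is an int[3]
    assert(chunk[2] == 42);
}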
Re: D Multidimensional arrays weirdness
On Tuesday, August 29, 2017 03:16:13 Johnson Jones via Digitalmars-d-learn wrote:
> I need to get this straight:
>
> A normal single dimensional array in D is defined as
>
> T[] arr
>
> and is a linear sequential memory array of T's with an unbound length and is effectively the same as T* (although D treats them differently?)?

A dynamic array is essentially a struct defined like so:

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

In actuality, for historical reasons, I think that it's unfortunately defined with void* at the moment instead of being templated, but effectively, that struct is what you're dealing with when you have a dynamic array. If you want the C/C++ equivalent, you use the ptr property to get at the underlying pointer.

> We can fix the length by adding an upper bound:
>
> T[N] arr;
>
> and this is equivalent to
>
> auto arr = cast(T[])malloc(T.sizeof*N)[0..N];
>
> possibly
>
> auto arr = cast(T[N])malloc(T.sizeof*N);

They are not at all equivalent. Fixed size arrays live on the stack, just like you'd get with T arr[N]; in C/C++. Static arrays have a fixed length and live on the stack (or directly inside of an object on the heap if they're a member variable of an object on the heap), and they have no indirections. With dynamic arrays, the struct itself that is the dynamic array lives directly on the stack or wherever it's put, but it refers to any slice of memory. Usually, that memory is on the GC-allocated heap, but it could be malloced memory or even a slice of a static array on the stack. They're resizable simply because the GC will either reallocate memory to resize them (e.g. if the memory is not GC-allocated or because there isn't enough room to expand it in place), or if the memory is GC-allocated, and there's room beyond the end of that slice of memory, then the dynamic array will just be made to refer to a larger slice of that block of memory.

> But then arr is a fixed type and we can't use it to resize the array later if needed.
>
> these don't actually work, for some odd reason, so
>
> auto arr = cast(T*)malloc(T.sizeof*N)[0..N];
>
> seems to be the way to go.
>
> So, what this "shows" is that in memory, we have
>
> relative address    type
> 0                   T
> 1*sizeof(T)         T
> 2*sizeof(T)         T
> ...
> (N-1)*sizeof(T)     T
>
> This is pretty basic and just standard arrays, all that is fine and dandy!
>
> Now, when it comes to multidimensional arrays:
>
> T[][] arr;
>
> There are two ways that the array can be laid out depending on how we interpret the order of the row/col or col/row.
>
> The most natural way to do this is to extend single dimensional arrays:
>
> T[][] is defined to be (T[])[]
>
> or, let's use a fixed array so we can be clear:
>
> T[N][M]
>
> which means we have M sequential chunks of memory where each chunk is a T[N] array.
>
> This is the natural way because it coincides with single arrays.
>
> Similarly, to access the element at the nth element in the mth chunk, we do
>
> t[n][m] because, again, this conforms with how we think of single arrays.
>
> Now, in fact, it doesn't matter too much if we call a row a column and a column a row (possibly performance, but as far as dealing with them, as long as we are consistent, everything will work).
>
> BUT! D seems to do something very unique,
>
> If one defines an array like
>
> T[N][M]
>
> one must access the element as
>
> t[m][n]!
>
> The accessors are backwards!
>
> This is a huge problem!
> > int[3][5] a; > > Lets access the last element: > > auto x = a[4][2]; > auto y = a[2][4]; <- the logical way, which is invalid in D > > This method creates confusion and can be buggy. If our array is > not fixed, and we use the *correct* way, then our bugs are at > runtime and maybe subtle. > > Why? Because the correct way only has one thing to get right, > which is being consistent, which is easy. > > In D, we not only have to be consistent, we also have to make > sure to reverse our array accessors from how we defined it. > > While it is a unique approach and may have something to do with > quantum entanglement, I'm curious who the heck came up with the > logic and if there is actually any valid reason? > > Or are we stuck in one of those "Can't change it because it will > break the universe" black holes? It's doing exactly what C/C++ would do if they declared static arrays as T[N][M] arr; instead of T arr[N][M]; The issue is that types are normally read outwards from the variable name. You mostly don't notice it, but it becomes critical to understand when trying to read function pointer types in C/C++. Regardless, a side effect of that is that putting the array lengths on the right like C/C++ does puts them in the order that people expect, whereas putting them on the left like D does makes them seem backwards. It's consistent with how the compiler reads types though. I'm inclined to think that we'd have been better off
D Multidimensional arrays weirdness
I need to get this straight:

A normal single dimensional array in D is defined as

T[] arr;

and is a linear sequential memory array of T's with an unbound length, and is effectively the same as T* (although D treats them differently?)?

We can fix the length by adding an upper bound:

T[N] arr;

and this is equivalent to

auto arr = cast(T[])malloc(T.sizeof*N)[0..N];

possibly

auto arr = cast(T[N])malloc(T.sizeof*N);

But then arr is a fixed type and we can't use it to resize the array later if needed. These don't actually work, for some odd reason, so

auto arr = cast(T*)malloc(T.sizeof*N)[0..N];

seems to be the way to go.

So, what this "shows" is that in memory, we have

relative address    type
0                   T
1*sizeof(T)         T
2*sizeof(T)         T
...
(N-1)*sizeof(T)     T

This is pretty basic and just standard arrays; all that is fine and dandy!

Now, when it comes to multidimensional arrays:

T[][] arr;

There are two ways that the array can be laid out depending on how we interpret the order of the row/col or col/row.

The most natural way to do this is to extend single dimensional arrays:

T[][] is defined to be (T[])[]

or, let's use a fixed array so we can be clear:

T[N][M]

which means we have M sequential chunks of memory where each chunk is a T[N] array. This is the natural way because it coincides with single arrays.

Similarly, to access the element at the nth element in the mth chunk, we do

t[n][m]

because, again, this conforms with how we think of single arrays.

Now, in fact, it doesn't matter too much if we call a row a column and a column a row (possibly performance, but as far as dealing with them goes, as long as we are consistent, everything will work).

BUT! D seems to do something very unique. If one defines an array like

T[N][M]

one must access the element as

t[m][n]!

The accessors are backwards! This is a huge problem!

int[3][5] a;

Let's access the last element:

auto x = a[4][2];
auto y = a[2][4]; <- the logical way, which is invalid in D

This method creates confusion and can be buggy. If our array is not fixed, and we use the *correct* way, then our bugs are at runtime and maybe subtle. Why? Because the correct way only has one thing to get right, which is being consistent, which is easy. In D, we not only have to be consistent, we also have to make sure to reverse our array accessors from how we defined them.

While it is a unique approach and may have something to do with quantum entanglement, I'm curious who the heck came up with the logic and if there is actually any valid reason? Or are we stuck in one of those "Can't change it because it will break the universe" black holes?
Re: General performance tip about possibly using the GC or not
On Tuesday, 29 August 2017 at 00:52:11 UTC, Cecil Ward wrote:
> I don't know when the GC actually gets a chance to run.

Another alternative that I *think* would work (maybe someone who knows a bit more about the GC can chime in?) is to manually stop the GC, then run collections when profiling shows that your memory usage is high. To get the GC functions, import core.memory. To stop the GC, put GC.disable() at the top of main(). To trigger a collection, call GC.collect(). That way you don't have to manually free everything; it's just one line of code.
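A minimal sketch of the pattern, using only the core.memory functions named above:

import core.memory : GC;

void main()
{
    GC.disable(); // stop automatic collections for the whole run

    // ... run the program, allocating as needed ...

    // When profiling shows memory usage is high, collect explicitly:
    GC.collect();
}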
Re: C callbacks getting a value of 0! Bug in D?
On Tuesday, 29 August 2017 at 02:47:34 UTC, Johnson Jones wrote:
> [...] Seems only long and ulong are issues.

With respect to the current major platforms you can reasonably expect software to run on, yes. Just don't try to use D on something with e.g. 32-bit C shorts unless you bind to it via c_short.
Re: C callbacks getting a value of 0! Bug in D?
On Tuesday, 29 August 2017 at 01:56:43 UTC, Moritz Maxeiner wrote:
> On Tuesday, 29 August 2017 at 01:34:40 UTC, Johnson Jones wrote:
>> [...] produces 4 on both x86 and x64. So, I'm not sure how you are getting 8.
>
> There are different 64bit data models [1] and it seems your platform uses LLP64, which uses 32bit longs. Am I correct in assuming you're on Windows (as they are the only major modern platform that I'm aware of that made this choice)?
>
> [1] https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models

Yes. I found this, which gives a map for all the types: https://dlang.org/spec/interfaceToC.html

Seems only long and ulong are issues.
Re: Cpu instructions exposed
On 29/08/2017 2:49 AM, Cecil Ward wrote:
> I have written a few zero-overhead (fully inlining) D wrappers around certain new x64 instructions as an exercise to help me learn D and get used to GDC asm. I've also written D replacements for older processors. They are templated functions with customised variants supporting a variety of different word-widths.
>
> 1. Would anyone find these useful? I bet I'm reinventing the wheel? (But still a good learning task for me.)

Sure, why not?

> 2. How best to get them reviewed for correct D-style and

Let's talk about 3.

> 3. how to package them up, expose them? They need to be usable by the caller in such a way as they get fully directly inlined with no subroutine calls or arg passing adaptation overhead so as to get the desired full 100% performance. For example a call with a literal constant argument should continue to mean an immediate operand in the generated code, which happens nicely currently in my testbeds. So I don't know, the user needs to see the lib fn _source_ or some equivalent GDC cleverness. (Like the entire thing in a .h file. Yes, I know, I know. :-) )

Dub + force -inline. Also you will need to support ldc and dmd.

> 4. I would like to do the same for LDC; unfortunately the asm system is rather different from GDC's. I don't know if there is anything clever I can do to try to avoid duplication of effort / totally split sources and double maintenance? (Desperation? Preprocess the D sources with an external tool if all else fails! Yuck. Don't have one at hand right now anyway.)

Duplicate. Nothing wrong with that for such little code. It's already abstracted nicely out. As long as you are using function arguments as part of your iasm, and push/pop registers you use, it should be inlined correctly as per the arguments. But I'd like to see some code before making any other remarks. I highly suggest you hang out on IRC (#d Freenode) to get interactive reviews and suggestions.
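As a rough sketch of the "Dub + force -inline" suggestion, a dub.json along these lines should do it (the package name and description are illustrative; the "inline" and "optimize" buildOptions map to the compiler's -inline/-O switches):

{
    "name": "cpu-intrinsics",
    "description": "Zero-overhead wrappers around x64 instructions",
    "buildOptions": ["inline", "optimize"]
}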
Re: C callbacks getting a value of 0! Bug in D?
On Tuesday, 29 August 2017 at 01:34:40 UTC, Johnson Jones wrote:
> import core.stdc.config;
> pragma(msg, c_long.sizeof);
>
> prints 4UL both on x64 and x86, and in C:
>
> void foo()
> {
>     int dummy;
>     switch (dummy)
>     {
>         case sizeof(long):
>         case sizeof(long):
>             break;
>     }
> }
>
> produces 4 on both x86 and x64. So, I'm not sure how you are getting 8.

It's because you're on Windows. There, C's long/unsigned long are 4 bytes in both 32- and 64-bit code. On Linux/Mac/*BSD, they're 4 bytes in 32-bit and 8 in 64-bit. This is why we have c_long and c_ulong, to hide those differences.
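A small sketch of binding a C function portably with those aliases (the C function name is hypothetical, for illustration only):

import core.stdc.config : c_long, c_ulong;

// c_long matches the C compiler's `long` on each platform:
// 4 bytes on Windows (32- and 64-bit), 8 bytes on 64-bit POSIX.
extern (C) c_long some_c_function(c_ulong flags); // hypothetical C API

void main()
{
    pragma(msg, c_long.sizeof); // 4 or 8, depending on platform
}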
Re: C callbacks getting a value of 0! Bug in D?
On Tuesday, 29 August 2017 at 01:34:40 UTC, Johnson Jones wrote:
> [...] produces 4 on both x86 and x64. So, I'm not sure how you are getting 8.

There are different 64bit data models [1] and it seems your platform uses LLP64, which uses 32bit longs. Am I correct in assuming you're on Windows (as they are the only major modern platform that I'm aware of that made this choice)?

[1] https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
Cpu instructions exposed
I have written a few zero-overhead (fully inlining) D wrappers around certain new x64 instructions as an exercise to help me learn D and get used to GDC asm. I've also written D replacements for older processors. They are templated functions with customised variants supporting a variety of different word-widths.

1. Would anyone find these useful? I bet I'm reinventing the wheel? (But still a good learning task for me.)

2. How best to get them reviewed for correct D-style and

3. how to package them up, expose them? They need to be usable by the caller in such a way as they get fully directly inlined with no subroutine calls or arg passing adaptation overhead so as to get the desired full 100% performance. For example a call with a literal constant argument should continue to mean an immediate operand in the generated code, which happens nicely currently in my testbeds. So I don't know, the user needs to see the lib fn _source_ or some equivalent GDC cleverness. (Like the entire thing in a .h file. Yes, I know, I know. :-) )

4. I would like to do the same for LDC; unfortunately the asm system is rather different from GDC's. I don't know if there is anything clever I can do to try to avoid duplication of effort / totally split sources and double maintenance? (Desperation? Preprocess the D sources with an external tool if all else fails! Yuck. Don't have one at hand right now anyway.)

Is there any way I could get D to actually generate some D code to help with that? I have seen some pretty mind-blowing stuff in D using mixin or something - looks fantastic, just like the power of our old friends the evil unconstrained C macros that can generate random garbage C source text without limit, but in D it's done right so the D source can actually be parsed properly, no two languages fighting. I recall using this kind of source generation for dealing with lots of different operator-overloading routines that all follow a similar pattern. Can't think where else. I don't know what is available and what the limits of various techniques are.

I'm wondering if I could get D to internally generate GDC-specific or LDC-specific source code strings - the two asm frameworks are syntactically different iirc - starting from a friendly generic neutral format, transforming it somehow. (If memory serves, I think GDC uses a non-D extended syntax, very close to the asm seen in GCC for C, for easier partial re-use of snippets from C sources. On the other hand LDC looks more like standard D with complex template expansion, but I haven't studied it properly.)

Any general tips to point me in the right direction, much appreciated.
Re: General performance tip about possibly using the GC or not
On Tuesday, 29 August 2017 at 00:52:11 UTC, Cecil Ward wrote:
> I am vacillating - considering breaking a lifetime's C habits and letting the D garbage collector make life wonderful by just cleaning up after me and ruining my future C discipline by not deleting stuff myself.

It's not a panacea, but it's also not the boogeyman some people make it out to be. You can let the GC do its thing most of the time and not worry about it. For the times when you do need to worry about it, there are tools available to mitigate its impact.

> I don't know when the GC actually gets a chance to run.

Only when memory is allocated from the GC, such as when you allocate via new, or use a built-in language feature that implicitly allocates (like array concatenation). And then, it only runs if it needs to.

> I am wondering if deleting the usual bothersome immediately-executed hand-written cleanup code could actually improve performance in a sense in some situations. If the cleanup is done later by the GC, then this might be done when the processor would otherwise be waiting for io, in the top loop of an app, say? And if so this would amount to moving the code to be run effectively like 'low priority' app-scheduled activities, when the process would be waiting anyway, so moving cpu cycles to a later time when it doesn't matter. Is this a reasonable picture?

When programming to D's GC, some of the same allocation strategies you use in C still apply. For example, in C you generally wouldn't allocate multiple objects in a critical loop because allocations are not cheap -- you'd preallocate them, possibly on the stack, before entering the loop. That same strategy is a win in D, but for a different reason -- if you don't allocate anything from the GC heap in the loop, then the GC won't run in the loop (see the sketch below).

Multiple threads complicate the picture a bit. A background thread might trigger a GC collection when you don't want it to, but it's still possible to mitigate the impact. This is the sort of thing that isn't necessary to concern yourself with in the general case, but that you need to be aware of so you can recognize it when it happens. An example that I found interesting was the one Funkwerk encountered when the GC was causing their server to drop connections [1].

> If I carry on deleting objects / freeing / cleaning up as I'm used to, without disabling the GC, am I just slowing my code down? Plus (for all I know) the GC will use at least some battery or possibly actually important cpu cycles in scanning and finding nothing to do all the time because I've fully cleaned up.

You generally don't delete or free GC-allocated memory. You can call destroy on GC-allocated objects, but that just calls the destructor and doesn't trigger a collection. And whatever you do with the C heap isn't going to negatively impact GC performance.

You can trigger a collection by calling GC.collect. That's a useful tool in certain circumstances, but it can also hurt performance by forcing collections when they aren't needed. The two fundamental mitigation strategies that you can follow in the general case: 1) minimize the number of allocations and 2) keep the size of allocations as small as possible. The first decreases the number of opportunities for a collection to occur; the second helps keep collection times shorter. That doesn't mean you should always work to avoid the GC, just be smart about how and when you allocate, just as you would in C and C++.
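A minimal sketch of the preallocation strategy (assumes std.array.Appender; the details are illustrative):

import std.array : appender;

void processAll(const(char[])[] inputs)
{
    auto buf = appender!(char[])(); // one allocation site, hoisted out
    foreach (line; inputs)
    {
        buf.clear();                // keeps the underlying storage
        buf ~= line;                // no new GC allocation once grown
        // ... work with buf.data ...
    }
}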
> I suppose there might also be a difference in cache-friendliness as cleaning up immediately by hand might be working on hot memory, but the GC scanner coming along much later might have to deal with cold memory, but it may not matter if the activity is app-scheduled like low priority work or is within time periods that are merely eating into io-bound wait periods anyway.
>
> I definitely need to read up on this. Have never used a GC language, just decades of C and mountains of asm.

You might start with the GC series on the D Blog [2]. The next post (Go Your Own Way Part Two: The Heap) is coming some time in the next couple of weeks.

> Any general guidance on how to optimise cpu usage particularly responsiveness.

If it works for C, it works for D. Yes, the GC can throw you into a world of cache misses, but again, smart allocation strategies can minimize the impact. Having worked quite a bit with C, Java, and D, my sense is that it's best to treat D more like C than Java. Java programmers have traditionally had little support for optimizing cache usage (there are libraries out there now that can help, and I hear there's movement to finally bring value type aggregates to the language), and with modern GC implementations as good as they are it's recommended to avoid the strategies of the past (such as pooling and reusing objects) in favor of allocating as needed. In D, you have the tools to optimize cache usage (such
Re: C callbacks getting a value of 0! Bug in D?
On Tuesday, 29 August 2017 at 00:42:45 UTC, Steven Schveighoffer wrote:
> On 8/28/17 7:47 PM, Johnson Jones wrote:
> [...]
>
> Then I think possibly the portaudio bindings are not correct. It's also possible that long is not 64-bit even on the platform you are using. Finally, it's also possible that the D compiler is not generating the call correctly.
>
> When I test on my mac, c_long is 8 bytes. I'm not sure what it is in your case; try this code and see what happens:
>
> import core.stdc.config;
> pragma(msg, c_long.sizeof); // prints 8 on my box.
>
> And in C:
>
> #include <stdio.h>
>
> int main()
> {
>     printf("%lu\n", sizeof(long)); // also prints 8 on my box
>     return 0;
> }
>
> They should match. If they don't, that's a D bug (in either core.stdc.config or the compiler, I'm not sure which). If they do, then one of the other two possibilities is happening, and I would lean towards the bindings being incorrect.
>
> -Steve

import core.stdc.config;
pragma(msg, c_long.sizeof);

prints 4UL both on x64 and x86, and in C:

void foo()
{
    int dummy;
    switch (dummy)
    {
        case sizeof(long):
        case sizeof(long):
            break;
    }
}

produces 4 on both x86 and x64 (the duplicate case triggers a compiler error that reveals the value). So, I'm not sure how you are getting 8.
Re: General performance tip about possibly using the GC or not
On 08/28/2017 06:25 PM, Ali Çehreli wrote:
> I don't like the current format of the page

Apparently, I was looking for this one:

https://dlang.org/blog/the-gc-series/

Ali
Re: General performance tip about possibly using the GC or not
I don't like the current format of the page (all articles are expanded as opposed to being an index page), but there are currently four D blog articles on GC and memory management:

https://dlang.org/blog/category/gc/

Ali
Re: General performance tip about possibly using the GC or not
On Tuesday, August 29, 2017 00:52:11 Cecil Ward via Digitalmars-d-learn wrote: > I am vacillating - considering breaking a lifetime's C habits and > letting the D garbage collector make life wonderful by just > cleaning up after me and ruining my future C disciple by not > deleting stuff myself. > > I don't know when the GC actually gets a chance to run. Normally, it's only run when you call new. When you call new, if it thinks that it needs to do a collection to free up some space, then it will. Otherwise, it won't normally ever run, because it's not sitting in its own thread like happens with Java or C#. However, if you need it to run at a particular time, you can call core.memory.GC.collect to explicitly tell it to run a collection. Similarly, you can call GC.disable to make it so that a section of code won't cause any collections (e.g. in a performance critical loop that can't afford for the GC to kick in), and then you can call GC.enable to turn it back on again. > I am wondering if deleting the usual bothersome > immediately-executed hand-written cleanup code could actually > improve performance in a sense in some situations. If the cleanup > is done later by the GC, then this might be done when the > processor would otherwise be waiting for io, in the top loop of > an app, say? And if so this would amount to moving the code to be > run effectively like 'low priority' app-scheduled activities, > when the process would be waiting anyway, so moving cpu cycles to > a later time when it doesn't matter. Is this a reasonable picture? > > If I carry on deleting objects / freeing / cleaning up as I'm > used to, without disabling the GC, am I just slowing my code > down? Plus (for all I know) the GC will use at least some battery > or possibly actually important cpu cycles in scanning and finding > nothing to do all the time because I've fully cleaned up. > > I suppose there might also be a difference in cache-friendliness > as cleaning up immediately by hand might be working on hot > memory, but the GC scanner coming along much later might have to > deal with cold memory, but it may not matter if the activity is > app-scheduled like low priority work or is within time periods > that are merely eating into io-bound wait periods anyway. > > I definitely need to read up on this. Have never used a GC > language, just decades of C and mountains of asm. For a lot of stuff, GCs will actually be faster. It really depends on what your code is doing. One aspect of this is that when you're doing manual memory management or reference counting, you're basically spreading out the collection across the program. It's costing you all over the place but isn't necessarily costing a lot in any particular place. The GC on the other hand avoids a lot of that cost as you're running, because your program isn't constantly doing all of that work to free stuff up - but when the GC does kick in to do a collection, then it costs a lot more for that moment than any particular freeing of memory would have cost with manual memory management. It's doing all of that work at once rather than spreading it out. Whether that results in a more performant program or a less performant program depends a lot on what you're doing and what your use case can tolerate. For most programs, having the GC stop stuff temporarily really doesn't matter at all, whereas for something like a real-time program, it would be fatal. So, it really depends on what you're doing. 
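A sketch of the GC.disable/GC.enable bracketing mentioned above:

import core.memory : GC;

void timingCriticalSection()
{
    GC.disable();             // no collection may start in here
    scope (exit) GC.enable(); // re-enable even if we throw

    foreach (i; 0 .. 1_000_000)
    {
        // ... work that must not be paused by a collection ...
    }
}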
Ultimately, for most programs, it makes the most sense to just use the GC and optimize your program where it turns out to be necessary. That could mean disabling the GC in certain sections of code, or it could mean managing certain memory manually, because it's more efficient to do so in that case. Doing stuff like allocating a lot of small objects and throwing them away will definitely be a performance problem for the GC, but it's not all that great for manual memory management either. A lot of the performance gains come from doing stuff on the stack where possible, which is one area where ranges tend to shine. Another thing to consider is that some programs will need to have specific threads not managed by the GC so that they can't be stopped during a collection (e.g. a program with an audio pipeline will probably not want that on a thread that's GC-managed), and that's one way to avoid a performance hit from the GC. That's a fairly atypical need though, much as it's critical for certain types of programs. All in all, switching to using the GC primarily will probably take a bit of a shift in thinking, but typical D idioms do tend to reduce the need for memory management in general and reduce the negative impacts that can come with garbage collection. And ultimately, some workloads will be more efficient with the GC. It's my understanding that relatively few programs end up needing to play games where they do things like disable the GC temporarily, but the tools are there if you need
Re: General performance tip about possibly using the GC or not
D's GC is stop-the-world (i.e. it pauses all threads) and does not run on its own (it has to be asked to collect). It is only given the opportunity to collect when you allocate (new / more memory). It can decide not to, or to do so, at any such point, making it very unpredictable. This is why we keep saying that it is not a magic bullet. It isn't. It just executes a simple set of logic and nothing more.
General performance tip about possibly using the GC or not
I am vacillating - considering breaking a lifetime's C habits and letting the D garbage collector make life wonderful by just cleaning up after me, and ruining my future C discipline by not deleting stuff myself.

I don't know when the GC actually gets a chance to run.

I am wondering if deleting the usual bothersome immediately-executed hand-written cleanup code could actually improve performance, in a sense, in some situations. If the cleanup is done later by the GC, then this might be done when the processor would otherwise be waiting for io, in the top loop of an app, say? And if so, this would amount to moving the code to be run effectively like 'low priority' app-scheduled activities, when the process would be waiting anyway, so moving cpu cycles to a later time when it doesn't matter. Is this a reasonable picture?

If I carry on deleting objects / freeing / cleaning up as I'm used to, without disabling the GC, am I just slowing my code down? Plus (for all I know) the GC will use at least some battery, or possibly actually important cpu cycles, in scanning and finding nothing to do all the time because I've fully cleaned up.

I suppose there might also be a difference in cache-friendliness, as cleaning up immediately by hand might be working on hot memory, but the GC scanner coming along much later might have to deal with cold memory; but it may not matter if the activity is app-scheduled like low-priority work or is within time periods that are merely eating into io-bound wait periods anyway.

I definitely need to read up on this. Have never used a GC language, just decades of C and mountains of asm. Any general guidance on how to optimise cpu usage, particularly responsiveness, would be welcome.

One pattern I used to use when writing service processes (server apps) is that of deferring compute tasks by using a kind of 'post this action' call, which adds an entry into a queue; the entry is a function address plus an arg list and represents work to be done later. In the top loop, the app then executes these 'posted' jobs at app-scheduled low priority relative to other activities and all handling of io and timer events, when it has nothing else to do, by simply calling through the function pointer in a post queue entry. So it's a bit like setting a timer for 0 ms and passing a callback function. Terminology: a DFC (deferred function call) or lazy, late execution might be other terms. (A sketch of this pattern appears below.)

I'm wondering if using the garbage collector well might fit into this familiar pattern? Is that fair? And might it actually even help performance for me if I'm lucky?
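A sketch of the 'post this action' pattern described above, using a delegate to capture the argument list rather than a raw function pointer plus args (all names are illustrative):

struct PostQueue
{
    void delegate()[] jobs;

    // "Post this action": queue work for later, app-scheduled execution.
    void post(void delegate() job) { jobs ~= job; }

    // Called from the top loop when there is nothing else to do.
    void runPending()
    {
        foreach (job; jobs)
            job();
        jobs.length = 0;
    }
}

unittest
{
    PostQueue q;
    int x = 1;
    q.post(() { x += 41; }); // deferred, like a 0 ms timer callback
    q.runPending();
    assert(x == 42);
}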
Re: C callbacks getting a value of 0! Bug in D?
On 8/28/17 7:47 PM, Johnson Jones wrote:
> On Monday, 28 August 2017 at 21:35:27 UTC, Steven Schveighoffer wrote:
>> On 8/27/17 10:17 PM, Johnson Jones wrote:
>>> Looking at the assembly shows something like this:
>>>
>>> 0041ea98 push 0x0
>>> 0041ea9a push 0x0
>>> 0041ea9c push 0x0
>>> 0041ea9e push dword 0x100
>>> 0041eaa3 mov ecx, [typeid(PaStreamParameters)+0xe36fc (0x80d4cc)]
>>> 0041eaa9 mov eax, [fs:0x2c]
>>> 0041eaaf mov edx, [eax+ecx*4]
>>> 0041eab2 push dword [edx+0x1c]
>>> 0041eab8 push dword [edx+0x18]
>>> 0041eabe push dword [ebp-0x54]
>>> 0041eac1 push dword [ebp-0x5c]
>>> 0041eac4 mov ebx, PA.stream (0x823f30)
>>> 0041eac9 push ebx
>>> 0041eaca call dword near [Pa_OpenStream (0x823f18)]
>>>
>>> I noticed that those 0's were the values being fed into the function. I remember converting c_ulong's to ulong's and that they were probably uint's in D. Converting those fixed the problem and the callback is now called! I converted all the ulongs to uint's but there were a few longs and I don't know if they are c_longs or d_longs... Anyways, at least I'm on the right track.
>>
>> For C/C++ interaction, always use c_... types if they are available. The idea is both that they will be correctly defined for the width, and also it will mangle correctly for C++ compilers (yes, long and int are mangled differently even when they are the same thing).
>
> In portaudio, this doesn't seem to be the case. I changed all the longs to ints and ran my code in x64 and it worked fine. It may just be that the stuff that uses long is not used in my code. In portaudio I see it using unsigned long, and so this should be 64 bits in x64. Surprised it worked. Maybe conversion is taking place, or the high bits of the long are 0'ed and so there is not much difference.

Then I think possibly the portaudio bindings are not correct. It's also possible that long is not 64-bit even on the platform you are using. Finally, it's also possible that the D compiler is not generating the call correctly.

When I test on my mac, c_long is 8 bytes. I'm not sure what it is in your case; try this code and see what happens:

import core.stdc.config;
pragma(msg, c_long.sizeof); // prints 8 on my box.

And in C:

#include <stdio.h>

int main()
{
    printf("%lu\n", sizeof(long)); // also prints 8 on my box
    return 0;
}

They should match. If they don't, that's a D bug (in either core.stdc.config or the compiler, I'm not sure which). If they do, then one of the other two possibilities is happening, and I would lean towards the bindings being incorrect.

-Steve
Re: Output range with custom string type
On Monday, 28 August 2017 at 14:27:19 UTC, Jacob Carlborg wrote:
> I'm working on some code that sanitizes and converts values of different types to strings. I thought it would be a good idea to wrap the sanitized string in a struct to have some type safety. Ideally it should not be possible to create this type without going through the sanitizing functions. The problem I have is that I would like these functions to push up the allocation decision to the caller. Internally these functions use formattedWrite. I thought the natural design would be that the sanitize functions take an output range and pass that to formattedWrite. Here's a really simple example:
>
> import std.stdio : writeln;
>
> struct Range
> {
>     void put(char c)
>     {
>         writeln(c);
>     }
> }
>
> void sanitize(OutputRange)(string value, OutputRange range)
> {
>     import std.format : formattedWrite;
>
>     range.formattedWrite!"'%s'"(value);
> }
>
> void main()
> {
>     Range range;
>     sanitize("foo", range);
> }
>
> The problem now is that the data is passed one char at a time to the range. Meaning that if the user implements a custom output range, the user is in full control of the data. It will now be very easy for the user to make a mistake or manipulate the data on purpose, making the whole idea of the sanitized type pointless.
>
> Any suggestions how to fix this or a better idea?

Q: is it an option to let the caller provide all the storage in an oversized fixed-length buffer? You could add a second helper function to compute and return a suitably safe, pessimistic (over-the-top) max value for the length required, which could be called once beforehand to establish the required buffer size (or check it). This is the technique I am using right now. My sizing function is ridiculously fast, as I am lucky in the particular use-case.
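A sketch of the sizing-function idea (the doubling bound is an assumption for illustration; the + 2 accounts for the surrounding quotes added by the formatting):

// Pessimistic upper bound on the sanitized length, computed cheaply
// before allocating; assumes sanitizing at most doubles the input.
size_t sanitizedLengthUpperBound(string value)
{
    return value.length * 2 + 2;
}

unittest
{
    auto value = "foo";
    auto buf = new char[](sanitizedLengthUpperBound(value));
    // ... format into buf[], then slice buf to the actual length ...
}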
Re: C callbacks getting a value of 0! Bug in D?
On Monday, 28 August 2017 at 21:35:27 UTC, Steven Schveighoffer wrote:
> On 8/27/17 10:17 PM, Johnson Jones wrote:
>> Looking at the assembly shows something like this:
>>
>> 0041ea98 push 0x0
>> 0041ea9a push 0x0
>> 0041ea9c push 0x0
>> 0041ea9e push dword 0x100
>> 0041eaa3 mov ecx, [typeid(PaStreamParameters)+0xe36fc (0x80d4cc)]
>> 0041eaa9 mov eax, [fs:0x2c]
>> 0041eaaf mov edx, [eax+ecx*4]
>> 0041eab2 push dword [edx+0x1c]
>> 0041eab8 push dword [edx+0x18]
>> 0041eabe push dword [ebp-0x54]
>> 0041eac1 push dword [ebp-0x5c]
>> 0041eac4 mov ebx, PA.stream (0x823f30)
>> 0041eac9 push ebx
>> 0041eaca call dword near [Pa_OpenStream (0x823f18)]
>>
>> I noticed that those 0's were the values being fed into the function. I remember converting c_ulong's to ulong's and that they were probably uint's in D. Converting those fixed the problem and the callback is now called! I converted all the ulongs to uint's but there were a few longs and I don't know if they are c_longs or d_longs... Anyways, at least I'm on the right track.
>
> For C/C++ interaction, always use c_... types if they are available. The idea is both that they will be correctly defined for the width, and also it will mangle correctly for C++ compilers (yes, long and int are mangled differently even when they are the same thing).
>
> -Steve

In portaudio, this doesn't seem to be the case. I changed all the longs to ints and ran my code in x64 and it worked fine. It may just be that the stuff that uses long is not used in my code. In portaudio I see it using unsigned long, and so this should be 64 bits in x64. Surprised it worked. Maybe conversion is taking place, or the high bits of the long are 0'ed and so there is not much difference. Anyways, I guess I'll deal with any of those bugs when I run into them, if they exist.
Re: Accessing outer class attribute from inner struct
On Monday, 28 August 2017 at 22:47:12 UTC, Andre Pany wrote:
> On Monday, 28 August 2017 at 22:28:18 UTC, Moritz Maxeiner wrote:
>> On Monday, 28 August 2017 at 21:52:58 UTC, Andre Pany wrote:
>>> [...] To make my question short :) If ColumnsArray is a class I can access the attribute "reference" but not if it is a struct. I would rather prefer a struct, but with a struct it seems I cannot access "reference". How can I access "reference" from my inner struct? [...]
>>
>> Add an explicit class reference member to it:
>>
>> ---
>> class TCustomGrid : TCustomPresentedScrollBox
>> {
>>     struct ColumnsArray
>>     {
>>         TCustomGrid parent;
>>
>>         TColumn opIndex(int index)
>>         {
>>             int r = getIntegerIndexedPropertyReference(parent.reference, "Columns", index);
>>             return new TColumn(r);
>>         }
>>     }
>>
>>     ColumnsArray Columns;
>>
>>     this()
>>     {
>>         Columns = ColumnsArray(this);
>>     }
>>     ...
>> }
>> ---
>>
>> Nesting structs inside anything other than functions[1] is for visibility/protection encapsulation and namespacing only.
>>
>> [1] non-static structs in functions are special as they have access to the surrounding stack frame
>
> Unfortunately that's not possible. ColumnsArray and the attribute will become a string mixin to avoid boilerplate. It would be error prone if I have to initialize them in the constructor too. I want just 1 single coding line for this property. That is also the reason I do not want to use a class, as I would have to initialize them in the constructor.

---
class C
{
    struct S
    {
    }

    S s;
}
---

is semantically equivalent to

---
struct S
{
}

class C
{
    S s;
}
---

with the two differences being:
- namespacing (outside of C one has to use C.S to access S)
- you can protect the visibility of S from outside the module C resides in via private, public, etc.

In both cases S doesn't inherently know about C, which means a solution using default initialization is not feasible, as S.init can't know about any particular instance of C. I don't think there's any way for you to avoid using a class constructor.
Re: testing for deprecation
On Monday, 28 August 2017 at 21:29:31 UTC, Jonathan M Davis wrote:
> I think that it's pretty clear that a new trait for __traits would be required. Per the documentation, getFunctionAttributes does not include anything about deprecation, and even if it did, it wouldn't be sufficient anyway, because it would only cover functions, whereas almost any symbol that isn't local to a function can be deprecated (the only case I can think of at the moment where you can't deprecate a symbol that isn't inside a function is enum members, which can't be individually deprecated, because you can't apply any attributes to them individually). We'd probably need something like __traits(isDeprecated, symbol).
>
> https://issues.dlang.org/show_bug.cgi?id=17791
>
> - Jonathan M Davis

Thanks for filing that!
Re: C callbacks getting a value of 0! Bug in D?
On Monday, 28 August 2017 at 22:41:56 UTC, Moritz Maxeiner wrote:
> On Monday, 28 August 2017 at 22:21:18 UTC, Johnson Jones wrote:
>> On Monday, 28 August 2017 at 21:35:27 UTC, Steven Schveighoffer wrote:
>>> [...]
>>
>> and where are these c_ types defined? The reason I replaced them was precisely because D was not finding them.
>
> core.stdc.config, which unfortunately doesn't appear in the online documentation AFAICT (something that ought to be fixed). A common workaround is to use pattern searching tools like grep if you know the phrase to look for:
>
> $ grep -Er c_long /path/to/imports
>
> or, in this case, since these things are usually done with aliases:
>
> $ grep -Er 'alias\s+\w*\s+c_long' /path/to/imports

Thanks. I copied over stuff from the bindings and from the original header and I guess I missed the import.
Re: Accessing outer class attribute from inner struct
On Monday, 28 August 2017 at 22:28:18 UTC, Moritz Maxeiner wrote:
> On Monday, 28 August 2017 at 21:52:58 UTC, Andre Pany wrote:
>> [...] To make my question short :) If ColumnsArray is a class I can access the attribute "reference" but not if it is a struct. I would rather prefer a struct, but with a struct it seems I cannot access "reference". How can I access "reference" from my inner struct? [...]
>
> Add an explicit class reference member to it:
>
> ---
> class TCustomGrid : TCustomPresentedScrollBox
> {
>     struct ColumnsArray
>     {
>         TCustomGrid parent;
>
>         TColumn opIndex(int index)
>         {
>             int r = getIntegerIndexedPropertyReference(parent.reference, "Columns", index);
>             return new TColumn(r);
>         }
>     }
>
>     ColumnsArray Columns;
>
>     this()
>     {
>         Columns = ColumnsArray(this);
>     }
>     ...
> }
> ---
>
> Nesting structs inside anything other than functions[1] is for visibility/protection encapsulation and namespacing only.
>
> [1] non-static structs in functions are special as they have access to the surrounding stack frame

Unfortunately that's not possible. ColumnsArray and the attribute will become a string mixin to avoid boilerplate. It would be error prone if I have to initialize them in the constructor too. I want just 1 single coding line for this property. That is also the reason I do not want to use a class, as I would have to initialize them in the constructor.

Kind regards
André
Re: C callbacks getting a value of 0! Bug in D?
On Monday, 28 August 2017 at 22:21:18 UTC, Johnson Jones wrote:
> On Monday, 28 August 2017 at 21:35:27 UTC, Steven Schveighoffer wrote:
>> On 8/27/17 10:17 PM, Johnson Jones wrote:
>> [...]
>>
>> For C/C++ interaction, always use c_... types if they are available. The idea is both that they will be correctly defined for the width, and also it will mangle correctly for C++ compilers (yes, long and int are mangled differently even when they are the same thing).
>>
>> -Steve
>
> and where are these c_ types defined? The reason I replaced them was precisely because D was not finding them.

core.stdc.config, which unfortunately doesn't appear in the online documentation AFAICT (something that ought to be fixed). A common workaround is to use pattern searching tools like grep if you know the phrase to look for:

$ grep -Er c_long /path/to/imports

or, in this case, since these things are usually done with aliases:

$ grep -Er 'alias\s+\w*\s+c_long' /path/to/imports
Re: Accessing outer class attribute from inner struct
On Monday, 28 August 2017 at 21:52:58 UTC, Andre Pany wrote:
> [...] To make my question short :) If ColumnsArray is a class I can access the attribute "reference" but not if it is a struct. I would rather prefer a struct, but with a struct it seems I cannot access "reference". How can I access "reference" from my inner struct? [...]

Add an explicit class reference member to it:

---
class TCustomGrid : TCustomPresentedScrollBox
{
    struct ColumnsArray
    {
        TCustomGrid parent;

        TColumn opIndex(int index)
        {
            int r = getIntegerIndexedPropertyReference(parent.reference, "Columns", index);
            return new TColumn(r);
        }
    }

    ColumnsArray Columns;

    this()
    {
        Columns = ColumnsArray(this);
    }
    ...
}
---

Nesting structs inside anything other than functions[1] is for visibility/protection encapsulation and namespacing only.

[1] non-static structs in functions are special as they have access to the surrounding stack frame
Re: C callbacks getting a value of 0! Bug in D?
On Monday, 28 August 2017 at 21:35:27 UTC, Steven Schveighoffer wrote:
> On 8/27/17 10:17 PM, Johnson Jones wrote:
> [...]
>
> For C/C++ interaction, always use c_... types if they are available. The idea is both that they will be correctly defined for the width, and also it will mangle correctly for C++ compilers (yes, long and int are mangled differently even when they are the same thing).
>
> -Steve

And where are these c_ types defined? The reason I replaced them was precisely because D was not finding them.
Accessing outer class attribute from inner struct
Hi, I build some framework to access Delphi components from D. Delphi supports property array access "StringGrid1.Columns[2]", which is translated in Delphi to a private method call "GetColumn(2)". I need to imitate this behavior in my D code. Therefore my TCustomGrid class has an inner struct ColumnsArray with an opIndex. While accessing opIndex I need to call a DLL method. Therefore I need the "reference" attribute, which is available in my TCustomGrid via inheritance.

To make my question short :) If ColumnsArray is a class I can access the attribute "reference" but not if it is a struct. I would rather prefer a struct, but with a struct it seems I cannot access "reference". How can I access "reference" from my inner struct?

class TCustomGrid : TCustomPresentedScrollBox
{
    struct ColumnsArray
    {
        TColumn opIndex(int index)
        {
            int r = getIntegerIndexedPropertyReference(reference, "Columns", index);
            return new TColumn(r);
        }
    }

    ColumnsArray Columns;
    ...
}

Kind regards
André
Re: Output range with custom string type
On Monday, 28 August 2017 at 14:27:19 UTC, Jacob Carlborg wrote:
> I'm working on some code that sanitizes and converts values of different types to strings. I thought it would be a good idea to wrap the sanitized string in a struct to have some type safety. Ideally it should not be possible to create this type without going through the sanitizing functions. The problem I have is that I would like these functions to push up the allocation decision to the caller. Internally these functions use formattedWrite. I thought the natural design would be that the sanitize functions take an output range and pass that to formattedWrite.
>
> [...]
>
> Any suggestions how to fix this or a better idea?

If you want the caller to be just in charge of allocation, that's what std.experimental.allocator provides. In this case, I would polish up the old "format once to get the length, allocate, format a second time into the allocated buffer" method used with snprintf for D:

--- test.d ---
import std.stdio;
import std.experimental.allocator;

struct CountingOutputRange
{
private:
    size_t _count;

public:
    size_t count() { return _count; }
    void put(char c) { _count++; }
}

char[] sanitize(string value, IAllocator alloc)
{
    import std.format : formattedWrite, sformat;

    CountingOutputRange r;
    (&r).formattedWrite!"'%s'"(value); // do not copy the range

    auto s = alloc.makeArray!char(r.count);
    scope (failure) alloc.dispose(s);

    // This should only throw if the user-provided allocator returned less
    // memory than was requested
    return s.sformat!"'%s'"(value);
}

void main()
{
    auto s = sanitize("foo", theAllocator);
    scope (exit) theAllocator.dispose(s);
    writeln(s);
}
---
Re: C callbacks getting a value of 0! Bug in D?
On 8/27/17 10:17 PM, Johnson Jones wrote:
> Looking at the assembly shows something like this:
>
> 0041ea98 push 0x0
> 0041ea9a push 0x0
> 0041ea9c push 0x0
> 0041ea9e push dword 0x100
> 0041eaa3 mov ecx, [typeid(PaStreamParameters)+0xe36fc (0x80d4cc)]
> 0041eaa9 mov eax, [fs:0x2c]
> 0041eaaf mov edx, [eax+ecx*4]
> 0041eab2 push dword [edx+0x1c]
> 0041eab8 push dword [edx+0x18]
> 0041eabe push dword [ebp-0x54]
> 0041eac1 push dword [ebp-0x5c]
> 0041eac4 mov ebx, PA.stream (0x823f30)
> 0041eac9 push ebx
> 0041eaca call dword near [Pa_OpenStream (0x823f18)]
>
> I noticed that those 0's were the values being fed into the function. I remember converting c_ulong's to ulong's and that they were probably uint's in D. Converting those fixed the problem and the callback is now called! I converted all the ulongs to uint's but there were a few longs and I don't know if they are c_longs or d_longs... Anyways, at least I'm on the right track.

For C/C++ interaction, always use c_... types if they are available. The idea is both that they will be correctly defined for the width, and also it will mangle correctly for C++ compilers (yes, long and int are mangled differently even when they are the same thing).

-Steve
Re: testing for deprecation
On Monday, August 28, 2017 13:08:04 jmh530 via Digitalmars-d-learn wrote:
> On Saturday, 26 August 2017 at 07:17:49 UTC, user1234 wrote:
> > getAttributes is made for UDAs only.
>
> Okay, well if you change it to
>
> deprecated {
>     void foo();
> }
>
> void main() {
>     pragma(msg, __traits(getFunctionAttributes, foo));
> }
>
> then you just get
>
> tuple(@system)
>
> so the issue still stands. I see no way to loop through members of a module at compile-time and exclude the ones that are deprecated.

I think that it's pretty clear that a new trait for __traits would be required. Per the documentation, getFunctionAttributes does not include anything about deprecation, and even if it did, it wouldn't be sufficient anyway, because it would only cover functions, whereas almost any symbol that isn't local to a function can be deprecated (the only case I can think of at the moment where you can't deprecate a symbol that isn't inside a function is enum members, which can't be individually deprecated, because you can't apply any attributes to them individually). We'd probably need something like __traits(isDeprecated, symbol).

https://issues.dlang.org/show_bug.cgi?id=17791

- Jonathan M Davis
Protection attribute in another module
How do I get the protection status of a function in another module? Basically I have some code that loops through all the members of another module and I want to be able to skip the ones that are private. The code below prints public for foo.

module A;

private void foo();

module B;

import A;

void main()
{
    import std.stdio : writeln;

    foreach (member; __traits(allMembers, A))
    {
        writeln(member);
        writeln(__traits(getProtection, member));
    }
}
Re: How do I create a fileWatcher with an onFileChange event using spawn?
On Monday, 28 August 2017 at 06:27:20 UTC, Jacob Carlborg wrote:
> On 2017-08-25 23:25, Enjoys Math wrote:
>> Something like this:
>>
>> module file_watcher;
>>
>> import std.concurrency;
>> import std.file;
>> import std.signals;
>> import std.datetime;
>>
>> void fileWatcher(Tid tid, string filename, int loopSleep)
>> {
>>     auto modified0 = timeLastModified(filename);
>>     while (true)
>>     {
>>         auto modified = timeLastModified(filename);
>>         if (modified > modified0)
>>         {
>>             modified0 = modified;
>>             //if (onFileChange !is null)
>>             //    onFileChange(receiver);
>>         }
>>         sleep(dur!"msecs"(loopSleep));
>>     }
>> }
>>
>> But I'm not sure how to send the onFileChange event.
>
> A delegate perhaps? Or you can look at any of the existing event driven libraries that do this:
>
> http://code.dlang.org/packages/vibe-core
> http://code.dlang.org/packages/libasync

No, a plain delegate won't work. There's something you're not telling me, because I've tried delegates. They have to be shared or something, and that causes a big mess with my code.
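One way around the shared-delegate mess is to keep the delegate on the owner thread and have the watcher send a plain message back; a sketch using only std.concurrency (the file name and sleep interval are illustrative):

import core.thread : Thread;
import core.time : dur;
import std.concurrency;
import std.file : timeLastModified;

struct FileChanged { string filename; }

void fileWatcher(Tid owner, string filename, int loopSleep)
{
    auto modified0 = timeLastModified(filename);
    while (true) // runs until the process ends; a sketch only
    {
        auto modified = timeLastModified(filename);
        if (modified > modified0)
        {
            modified0 = modified;
            owner.send(FileChanged(filename)); // no sharing needed
        }
        Thread.sleep(dur!"msecs"(loopSleep));
    }
}

void main()
{
    spawn(&fileWatcher, thisTid, "watched.txt", 500);
    // The owner thread dispatches to an ordinary thread-local delegate:
    receive((FileChanged fc) { /* onFileChange(fc.filename); */ });
}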
Output range with custom string type
I'm working on some code that sanitizes and converts values of different types to strings. I thought it would be a good idea to wrap the sanitized string in a struct to have some type safety. Ideally it should not be possible to create this type without going through the sanitizing functions. The problem I have is that I would like these functions to push up the allocation decision to the caller. Internally these functions use formattedWrite. I thought the natural design would be that the sanitize functions take an output range and pass that to formattedWrite. Here's a really simple example:

import std.stdio : writeln;

struct Range
{
    void put(char c)
    {
        writeln(c);
    }
}

void sanitize(OutputRange)(string value, OutputRange range)
{
    import std.format : formattedWrite;

    range.formattedWrite!"'%s'"(value);
}

void main()
{
    Range range;
    sanitize("foo", range);
}

The problem now is that the data is passed one char at a time to the range. Meaning that if the user implements a custom output range, the user is in full control of the data. It will now be very easy for the user to make a mistake or manipulate the data on purpose, making the whole idea of the sanitized type pointless.

Any suggestions how to fix this or a better idea?

-- /Jacob Carlborg
Re: fasta parser with iopipe?
On Wednesday, 23 August 2017 at 13:06:36 UTC, Steven Schveighoffer wrote:
On 8/23/17 5:53 AM, biocyberman wrote:
[...]

I'll respond to all your questions with what I would do, instead of answering each one. I would suggest an approach similar to how I approached parsing JSON data. In your case, the protocol is even simpler, as there is no nesting.

1. The base layer iopipe should be something that tokenizes the input into reference-based structs. If you look at the jsoniopipe library (https://github.com/schveiguy/jsoniopipe), you can see that the lowest level finds the start of the next JSON token. In your case, it should be looking for >[...] This code is pretty straightforward, and roughly corresponds to this:

while(cannot find start sequence in stream)
    stream.extend;

Make sure you aren't re-doing work that has already been done (i.e. save the last place you looked). Once you have this, you can deduce each packet by the data between the starts.

2. The next layer should validate and parse the data into structs that contain referencing data from the buffer. I recommend not using actual ranges from the buffer, but information on how to build the ranges. The reason for this is that the buffer can move while being streamed by iopipe, so your data could become invalid if you take actual references to the buffer. If you look in the jsoniopipe library, the struct for storing a JSON item has a start and length, but not a reference to the buffer. Potentially, you could take this mechanism and build an iopipe on top of the buffered data. This iopipe's elements would be the items themselves, with the underlying buffer hidden in the implementation details. Extending would parse out another set of items; releasing would allow those items to get reclaimed (along with the underlying stream data). This is something I actually wanted to explore with jsoniopipe but didn't have time for before the conference. I probably will still build it.

3. Build your real code on top of that layer. What do you want to do with the data? The easiest thing to do for a proof of concept is build a range out of the functions. That can allow you to test performance with your lower layers. One of the awesome things about iopipe is that testing correctness is really easy -- every string is also an iopipe :)

I actually worked with a person at dconf on a similar (maybe identical?) format and explained how it could be done in a very similar way. He was looking to remove data that had a low percentage of correctness (or something like that; I'm not in bioinformatics, so I don't understand the real semantics). With this mechanism in hand, the decompression is pretty easy to chain together with whatever actual stream you have -- just use iopipe.zip.

Good luck, and email me if you need more help (schvei...@yahoo.com).

-Steve

Hi Nic and Steve,

Thank you both very much for your input. I am trying to make use of it. I will try to adapt jsoniopipe for Fasta. This is ongoing and broken code: https://github.com/biocyberman/fastaq . PRs are welcome.

@Nic: I too am very interested in bringing D to bioinformatics. I will be happy to share the information I have. Feel free to email me at vql(.at.)rn.dk and we can talk further about it.

@Steve: Yes, we talked at dconf 2017. I had other things to do, so my D learning slowed down. I am trying the Fasta format before jumping to Fastq again. jsoniopipe is a full-featured and relatively small project, which can be used as a study case. However, there are some aspects I still haven't fully understood.
Would I be lucky enough to have you make the current broken code of fastaq work? :) That would definitely save me time and the headache of dealing with newbie problems.
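A minimal sketch of the tokenizing layer from Steve's step 1, assuming iopipe's window/extend interface; the function name and the details are my own, untested illustration:

import std.algorithm : countUntil;

// Scan forward for the next '>' record start, extending the stream as
// needed; `searched` remembers how far we have already looked so no
// work is repeated.
size_t findRecordStart(Chain)(ref Chain pipe, size_t searched)
{
    while (true)
    {
        auto idx = pipe.window[searched .. $].countUntil('>');
        if (idx != -1)
            return searched + idx;       // found a record start
        searched = pipe.window.length;   // remember where we stopped
        if (pipe.extend(0) == 0)         // 0 lets iopipe pick the size
            return pipe.window.length;   // end of stream
    }
}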
Re: No CTFE of function
On Saturday, 26 August 2017 at 16:52:36 UTC, Cecil Ward wrote: I have a pure function that has constant inputs, known at compile-time, and contains no funny stuff internally - I looked at the generated code, and there are no RTL calls at all. But in a test call with constant literal values (arrays initialised to literals) passed to the pure routine, GDC refuses to CTFE the whole thing, whereas I would expect it (based on previous experience with D and GDC) to simply generate a trivial function that puts out a block of CTFE-evaluated constant data corresponding to the input. Unfortunately it's a bit too long to post in here. I've tried lots of variations. The function is marked @nogc @safe pure nothrow. Any ideas as to why GDC might just refuse to do CTFE on compile-time-known inputs in a truly pure situation? Haven't tried DMD yet. Can try LDC. Am using d.godbolt.org to look at the result, as I don't have a machine here to run a D compiler on. Other things I can think of: it contains function-in-a-function calls, which are all nicely inlined in the generated code, and it's not the first time I've done that with GDC either. Switches: am using -Os or -O2 or -O3 - tried all. Tuning to presume and enable the latest x86-64 instructions. Release build, no bounds checks.

I will henceforth follow the enum trick advice at all times. I noticed that the problem with = void initialization is compiler-dependent. Using an enum for real CTFE, I don't get error messages from the LDC or GDC x64 compilers (i.e. [old?] versions currently up on d.godbolt.org) even if I do use the = void optimisation. This saved a totally wasteful and pointless zero-fill of 64 bytes using 2 YMM instructions in the particular unit test case I had, but of course it could easily be dramatically bad news depending on the size of the array I am unnecessarily filling.
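For reference, a minimal sketch of the "enum trick" mentioned above: an enum initializer must be known at compile time, so it forces CTFE, whereas an ordinary local variable initialization does not.

int square(int x) pure nothrow @safe @nogc { return x * x; }

void main() {
    enum atCompileTime = square(8); // forced through CTFE
    auto atRunTime = square(8);     // ordinary call; the optimizer may
                                    // still fold it, but CTFE is not
                                    // guaranteed
}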
Re: No CTFE of function
On Monday, 28 August 2017 at 03:16:24 UTC, Mike Parker wrote: On Sunday, 27 August 2017 at 17:47:54 UTC, Cecil Ward wrote: [...] The rules for CTFE are outlined in the docs [1]. What is described there is all there is to it. If those criteria are not met, the function cannot be executed at compile time. More importantly, as mentioned earlier in the thread, CTFE will only occur if a function *must* be executed at compile time, i.e. it is in a context where the result of the function is required at compile time. An enum declaration is such a situation; a variable initialization is not. [...] Those links are extremely useful. Many thanks. Because I am full of NHS pain drugs, I am pretty confused half the time, so finding documentation through the haze is difficult for me; much appreciated. RTFM of course applies, as always.
Re: No CTFE of function
On Sunday, 27 August 2017 at 00:08:45 UTC, Jonathan M Davis wrote: [...] Indeed. I used the term CTFE too loosely.
Re: testing for deprecation
On Saturday, 26 August 2017 at 07:17:49 UTC, user1234 wrote: getAttributes is made for UDAs only.

Okay, well if you change it to

deprecated {
    void foo();
}

void main() {
    pragma(msg, __traits(getFunctionAttributes, foo));
}

then you just get

tuple(@system)

so the issue still stands. I see no way to loop through members of a module at compile-time and exclude the ones that are deprecated.
Re: How do I create a fileWatcher with an onFileChange event using spawn?
On 2017-08-28 08:31, Nemanja Boric wrote: On Monday, 28 August 2017 at 06:27:20 UTC, Jacob Carlborg wrote:

http://code.dlang.org/packages/vibe-core
http://code.dlang.org/packages/libasync

In addition, to avoid polling, it's possible to register with the operating system so it will tell you when a modification to the given file has happened:

https://msdn.microsoft.com/en-us/library/aa364417%28VS.85%29.aspx?f=255=-2147217396
http://man7.org/linux/man-pages/man7/inotify.7.html

That's what the two libraries above provide, in a cross-platform way.

--
/Jacob Carlborg
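For the Linux case, a minimal sketch of the inotify approach using druntime's bindings (error handling omitted; the buffer size is an arbitrary choice):

version (linux) {
    import core.sys.linux.sys.inotify;
    import core.sys.posix.unistd : close, read;
    import std.stdio : writeln;
    import std.string : toStringz;

    void watchFile(string filename) {
        int fd = inotify_init();
        int wd = inotify_add_watch(fd, filename.toStringz, IN_MODIFY);
        ubyte[4096] buf;                          // room for a few events
        while (read(fd, buf.ptr, buf.length) > 0) // blocks until a change
            writeln(filename, " modified");
        inotify_rm_watch(fd, wd);
        close(fd);
    }
}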
Re: How do I create a fileWatcher with an onFileChange event using spawn?
On Monday, 28 August 2017 at 06:27:20 UTC, Jacob Carlborg wrote:
On 2017-08-25 23:25, Enjoys Math wrote:
Something like this:

module file_watcher;

import std.concurrency;
import std.file;
import std.signals;
import std.datetime;
import core.thread : Thread; // needed: sleep is not a free function

void fileWatcher(Tid tid, string filename, int loopSleep) {
    auto modified0 = timeLastModified(filename);
    while (true) {
        auto modified = timeLastModified(filename); // 'auto' was missing
        if (modified > modified0) {
            modified0 = modified;
            //if (onFileChange !is null)
            //    onFileChange(receiver);
        }
        Thread.sleep(dur!"msecs"(loopSleep));
    }
}

But I'm not sure how to send the onFileChange event. A delegate perhaps?

Or you can look at any of the existing event driven libraries that do this:

http://code.dlang.org/packages/vibe-core
http://code.dlang.org/packages/libasync

In addition, to avoid polling, it's possible to register with the operating system so it will tell you when a modification to the given file has happened:

https://msdn.microsoft.com/en-us/library/aa364417%28VS.85%29.aspx?f=255=-2147217396
http://man7.org/linux/man-pages/man7/inotify.7.html
Re: How do I create a fileWatcher with an onFileChange event using spawn?
On 2017-08-25 23:25, Enjoys Math wrote:
Something like this:

module file_watcher;

import std.concurrency;
import std.file;
import std.signals;
import std.datetime;
import core.thread : Thread; // needed: sleep is not a free function

void fileWatcher(Tid tid, string filename, int loopSleep) {
    auto modified0 = timeLastModified(filename);
    while (true) {
        auto modified = timeLastModified(filename); // 'auto' was missing
        if (modified > modified0) {
            modified0 = modified;
            //if (onFileChange !is null)
            //    onFileChange(receiver);
        }
        Thread.sleep(dur!"msecs"(loopSleep));
    }
}

But I'm not sure how to send the onFileChange event. A delegate perhaps?

Or you can look at any of the existing event driven libraries that do this:

http://code.dlang.org/packages/vibe-core
http://code.dlang.org/packages/libasync

--
/Jacob Carlborg