Re: Generic function resolution
The issue is that `l` is only defined in `g` when it is called with type B. In the `let sk` line, however, you call `f` with type C, which calls `g` with that same type. The `when` branch therefore isn't part of the `g` instantiation for that case, so `l` is never defined. Is that clear?
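A minimal sketch of the situation (the names `f`, `g`, `l`, `B`, `C` are taken from your snippet; the bodies here are reconstructed and purely hypothetical):

```nim
type
  B = object
  C = object

proc g[T](x: T) =
  when T is B:
    var l = 10      # `l` only exists in the instantiation for `B`
  echo l            # error in `g[C]`: undeclared identifier 'l'

proc f[T](x: T) = g(x)

f(B())  # compiles: the `when` branch is part of `g[B]`
f(C())  # fails to compile: `g[C]` never declares `l`
```

Generics are instantiated per type, so the `when` branch is simply dropped from the `C` instantiation.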
Re: Choosing Nim
> The experiment I worked on was BaBar at SLAC.

Oh, that's nice and also kind of a funny coincidence. That means you were involved with an experiment studying weak CP violation, whereas we're now trying to study strong CP violation, heh.

> Although the simulation and reconstruction and analysis code was written in C++, book-keeping was better done in a scripting language. It was easier to do this in Python. I _think_, but you would know better, Python has moved to the analysis area too. So, more researchers are using Python there.

Yes, a lot of people use Python all over the place in physics now. Most of my ATLAS colleagues write their ROOT scripts using pyroot / rootpy instead of ROOT's C++ abomination. Which is always kind of funny when you see them talking about "running python" but encountering segmentation faults... :)

> For analysis, I think Nim could have an advantage as it's faster. I think the time to develop the code is about the same, but Nim would reduce the execution time. It would also help to reduce the debugging time as a chunk of time is spent keeping track of which variables are which type.

Oh yes, for sure! Unfortunately I feel most physicists don't realize that dynamic typing is a burden.

> I think I recall that someone wrote an interface to ROOT for Nim, so you can read and manipulate your data in Nim.

Yes? I haven't seen that. Sounds to me like creating that wrapper would be kind of a pain, given how ROOT is 1. C++ and 2. essentially provides its own standard library.

> For your analysis are you using Nim? I think that would be a great thing. I know nothing about Axions, or searches for Axions. I should look it up to find out more.

Yep, I'm writing my whole analysis in Nim.
Since the axion community is still pretty tiny (although it has been growing and should grow even more now after the last European Strategy for Particle Physics update, which really endorses axion searches), I'm not really forced to use some existing analysis framework. There was some code written by my predecessor, but that was all ROOT and MarlinTPC. Threw it all away and started from scratch. Code is here: [https://github.com/Vindaar/TimepixAnalysis](https://github.com/Vindaar/TimepixAnalysis) It's essentially one big mono repository for my whole thesis though. The most interesting code is in the Analysis directory.

In general, axion searches all deal with the same big problem: given that axions haven't been detected yet, their detection via some interaction must be hard (-> the coupling constants are tiny). What does that imply? That no matter what kind of experiment one builds, all sorts of background will massively dominate everything one measures. So they are all very low rate experiments, which need the best possible background suppression (both hardware- and software-wise). In that sense it's a little similar to neutrino experiments, except even worse. Also, neutrino experiments nowadays have the benefit of simply having a lot more manpower and money to build in better locations (e.g. waaayy below ground to shield from cosmics) than we do. My experiment - CAST - is simply sitting in a random hall at surface level at CERN. The only shielding from muons I have is ~20 cm of lead. So there's still like ~1 muon every couple of seconds in my detector.

What we want to measure are X-rays, which are the result of axions entering our magnet (LHC prototype dipole magnet, 9 m long, 9 T magnetic field) and interacting with the virtual photons of the magnetic field. These axions would themselves just be the result of X-rays interacting in the Sun, randomly converting to axions and then leaving the Sun unhindered.

Have a great weekend!
Re: Choosing Nim
I always like to hear about why people pick Nim!

> Many years ago I was tasked with looking after a database for a particle physics experiment.

That's awesome! May I ask which experiment that was? Just curious, because I'm currently doing my PhD in Physics. The majority of my group actually works on ATLAS (both data analysis and hardware development for the HL-LHC), but I search for axions with CAST. :)
Re: gr.nim - floats in FFI
To be honest, the comment by the guy who talks about the JSON representation is just... well. The numbers there are just what stringification of floats looks like. Consider the value in the middle, e-16, which is just 0. Or rather supposed to be.

If the JSON conversion were done by Nim, I'd say this is related to these issues: [https://github.com/nim-lang/nim/issues?q=is%3Aissue+float+round+trip](https://github.com/nim-lang/nim/issues?q=is%3Aissue+float+round+trip) But from what I understand it's done by GR internally. Either the code of the C example in your issue happens to result in "nicer" numbers (which as strings are 2 minus some epsilon), or there's some machinery involved that automatically turns the floats into "nice" strings, which for some reason isn't triggered in your code.

In any case, the determination of the axis sizes should allow for some epsilon above a "nice" tick value and just cut off from there, precisely for this reason. And this should (if you have to use a JSON representation internally..) not rely on the stringification of floats to produce "nice" numbers.

ggplotnim still doesn't handle this either though. I've thought about it quite a few times, but so far have been too lazy to handle this correctly without throwing out information in cases where some apparent epsilon is a "real value" etc.

Also it's kinda funny that the data range calculation is confused by the offset in the numbers, but handles the 0 tick label correctly. I obviously don't know how they calculate their tick values exactly, but ggplotnim uses linspace internally, and then you run into this exact issue again when determining the tick values and have to make sure you don't print the 0 tick label as 1e-16 something.

Sorry, this was probably not that helpful. As far as I see, your code looks fine (though you should use the implicit result variable, set the size of result from the start and assign via `result[i] = val`! :) ).
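To illustrate what I mean (plain float behavior, independent of GR or anyone's JSON handling), plus a hypothetical `snapTick` helper sketching the epsilon-cutoff idea:

```nim
import math

let x = 0.1 + 0.2
echo x == 0.3          # false: the sum is not exactly 0.3
echo x                 # stringification prints the round-trip decimal, not "0.3"

# hypothetical helper: snap a tick value to its rounded form if within eps
proc snapTick(x: float, eps = 1e-10): float =
  let r = round(x)
  if abs(x - r) < eps: r else: x

echo snapTick(1e-16)   # 0.0: treat the epsilon offset as a real zero tick
```

The danger is of course exactly what I mention above: an apparent epsilon can be a real value, so a fixed `eps` is only a heuristic.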
Re: Typography update - now it can render 99% the Google Fonts ttf.
I haven't had a use for this so far, but this is amazing! You have a typo in your URL there. Missing the y at the end. :)
Re: ggplotnim - pretty native plots for us
I wasn't aware of the GR framework. It certainly looks interesting. However, it does _not_ look more lightweight than cairo. Just having Qt as a dependency is an immediate no-go for me, at least for a default backend (unless I'm missing something and you can easily get binaries without the Qt dependency and build it without it). Also it obviously does a lot more than cairo. It's a full-fledged visualization library. For ggplotnim's purposes the only advantage it would have would be access to more backends, as far as I can see.

Adding a new backend to ginger is in principle as easy as providing these procs: [https://github.com/Vindaar/ginger/blob/master/src/ginger/backendDummy.nim](https://github.com/Vindaar/ginger/blob/master/src/ginger/backendDummy.nim) And see the actual cairo backend: [https://github.com/Vindaar/ginger/blob/master/src/ginger/backendCairo.nim](https://github.com/Vindaar/ginger/blob/master/src/ginger/backendCairo.nim) So feel free to add a new GR backend to ginger if you'd like!

To me the most important features I want from backends are:

* png, pdf support: provided by cairo already
* LaTeX handling of labels / text: will be done via a tikz backend (good to see that apparently GR is going that route for LaTeX too!)
* an interactive viewer: not implemented, but can also be done via cairo. The more challenging aspect is writing the logic that allows for updates in the first place (and if possible incremental updates of the plot, but that's hard with the current implementation I think)
* a Vega backend: well, has to be done by writing a Vega backend

I can totally see how GR can be a great library to build a powerful visualization library on, if used from the onset. It seems to take care of a lot of annoying details I had to get right.
Re: ggplotnim - pretty native plots for us
Sorry about that. When I started writing this I had no idea cairo would be such a pain on Windows. There's an issue about it here: [https://github.com/Vindaar/ggplotnim/issues/57](https://github.com/Vindaar/ggplotnim/issues/57)

I haven't updated the README yet, mostly because I don't have a good solution either yet. The easiest for me on a practical level was to just install emacs and add it to my PATH (which is, I guess, equivalent to you using the Inkscape libraries). I can think about either adding working versions of the required libraries to the repository for Windows (at least win64), or a script which clones the cairo repository and builds it locally. I haven't built cairo locally yet, so I don't know if it works well.

Now regarding your actual question. If you want to ship a program which uses ggplotnim internally, you have to do what people do on Windows as far as I know: bundle all required DLLs with the program. The other alternative would be a static build of cairo.

I'll see what I can do to improve the situation. Thanks for the input!
Re: ggplotnim - pretty native plots for us
I'm happy to say that facet_wrap is finally back with version v0.3.5.

Normal classification by (in this case 2) discrete variable(s):

Classification by a discrete variable with free scales:

See the code for these two here: [https://github.com/Vindaar/ggplotnim/blob/master/recipes.org#facet-wrap-for-simple-grid-of-subplots](https://github.com/Vindaar/ggplotnim/blob/master/recipes.org#facet-wrap-for-simple-grid-of-subplots)

Other notable changes of the last few versions include:

* all recipe plots are now also checked in the CI, based on the JSON representation of the final Viewport, which is drawn by ginger
* bar plots can now show negative bars
* gather on the arraymancer backend no longer requires all columns to be of the same type
* ridgeline plots were added. There's no recipe yet, because one thing still has to be fixed: the size of the topmost ridge is not scaled if the content exceeds the size of the ridge (when an overlap > 1 is used). With the changes done for the facet_wrap fix, however, this is finally possible to implement.

See the full changelog for all recent changes: [https://github.com/Vindaar/ggplotnim/blob/master/changelog.org](https://github.com/Vindaar/ggplotnim/blob/master/changelog.org)
Re: Iterate over fields
First of all, see of course the docs here: [https://nim-lang.github.io/Nim/macros.html#quote%2Ctyped%2Cstring](https://nim-lang.github.io/Nim/macros.html#quote%2Ctyped%2Cstring) and the macro tutorial, specifically: [https://nim-lang.github.io/Nim/tut3.html#introduction-generating-code](https://nim-lang.github.io/Nim/tut3.html#introduction-generating-code)

So the basic idea is that quote do allows you to write exactly the code you want to generate. However, in most cases that alone isn't very helpful, because if you can explicitly write your code, you could also just write a template / proc. That's where the back ticks come in: they perform actual quoting of NimNodes defined in the current scope, which are then inserted in those places. quote do is thus just a nice way to avoid having to build the AST manually (as I for instance do in the newVar proc), while keeping the ability to insert NimNodes you calculate / determine somehow, based on what the macro is supposed to accomplish.

Another thing to keep in mind when using quote do is what happens to the stuff that's not quoted with back ticks. As a rule of thumb (someone please correct me):

* any procedure / template you use within quote do will be bound in the scope where the macro code is injected
* any variables you introduce will be "gensym'd", that is, for each symbol you introduce a unique symbol will be created. So if you write `var x = 5` within quote do, the final code won't have the variable x, but something like x_12345.

The second point means that if you want to refer to some variable that will be known in the scope in which the macro is used, you have to create the identifier manually and quote it. Due to the first point, you fortunately don't have to do the same for procedures you want to use.
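A small self-contained example of both points (all names here are made up for illustration):

```nim
import macros

macro defineVar(name: untyped): untyped =
  result = quote do:
    var `name` = 5   # back-ticked: keeps the user-visible identifier
    var y = 10       # not quoted: gensym'd into something like `y_12345`
    echo `name` + y  # this `y` refers to the gensym'd symbol, so it works

defineVar(x)
echo x      # 5: `x` is visible in the calling scope
# echo y    # would not compile: the gensym'd `y` is not accessible here
```

So `x` escapes into the caller's scope because we quoted it, while `y` stays private to the generated code.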
Re: Iterate over fields
> I'll look into your solution since I may need to adapt a few things (I've simplified the real use cases to summarize them into a single problem). The goal is also to learn Nim's macros as well. I've now spent probably as much time on macros as it would have taken to write the solution by hand, but it's not as fun.

If you have questions about my code there or general macro questions, just ask. I'll try to help!
Re: Iterate over fields
If all of your procs are going to look like newFooBar above there, it's possible to generate them with a macro.

```nim
import macros, tables

type
  Tensor[T] = object
    discard
  Model = object
    field1: string
    field2: string
    field3: int
  FooObj = object
    field1: Tensor[float]
    field2: Table[string, float]
    field3: int

proc unpack[T; U](arg: T, to: var U) = discard

proc newVar(name, dtype: NimNode): NimNode =
  result = nnkVarSection.newTree(
    nnkIdentDefs.newTree(
      ident(name.toStrLit.strVal), # replace by new ident
      dtype,
      newEmptyNode()
    )
  )
  echo result.repr

macro genNewObjProc(obj, model: typed): untyped =
  let objFields = obj.getType[1].getTypeImpl[2]   # get recList of type
  let modelFields = obj.getType[1].getTypeImpl[2] # get recList of type
  doAssert objFields.len == modelFields.len
  var body = newStmtList()
  let modelIdent = ident"model"
  # variable to hold object constructor `FooBar(field1: field1, ...)`
  var objConstr = nnkObjConstr.newTree(obj)
  for i in 0 ..< objFields.len:
    let modelName = ident(modelFields[i][0].toStrLit.strVal) # replace by new ident
    doAssert eqIdent(objFields[i][0], modelName)
    let objType = objFields[i][1]
    body.add newVar(modelName, objType)
    body.add quote do:
      unpack(`modelIdent`.`modelName`, `modelName`)
    # add to object constructor
    objConstr.add nnkExprColonExpr.newTree(modelName, modelName)
  # add resulting `FooObj` call
  let resIdent = ident"result"
  body.add quote do:
    `resIdent` = `objConstr`
  let procParams = [obj, # return type
                    nnkIdentDefs.newTree(modelIdent, model, newEmptyNode())]
  result = newProc(name = ident("new" & obj.toStrLit.strVal),
                   params = procParams,
                   body = body)
  echo result.repr

genNewObjProc(FooObj, Model)
```

I'm not sure how helpful it is to get such a macro if one isn't familiar with macros. But since it's fun to write, I might as well give you a solution. :)
Re: Error: got proc, but expected proc {.closure.}
The proc you want to return shouldn't have a name. So this line:

```nim
result = proc differentiate(c: int): int =
```

should be

```nim
result = proc (c: int): int =
```
Re: Undeclared field: 'keys' (iterator call)
Can you share a more complete example on [https://play.nim-lang.org](https://play.nim-lang.org)?
Re: Undeclared field: 'keys' (iterator call)
I'm on my phone right now, so I won't try to find the correct issues. This is a problem of toSeq in combination with method UFCS. Call the keys iterator as a normal function call to toSeq and it should work.
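In code, the difference looks like this (hypothetical minimal example):

```nim
import tables, sequtils

var t = {"a": 1, "b": 2}.toTable
# echo t.keys.toSeq   # fails: method call syntax doesn't resolve the iterator
echo toSeq(t.keys)    # works: toSeq receives the `keys` iterator directly
```

Calling `toSeq` as a regular function sidesteps the dot-call resolution problem.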
Re: High to Low on sequence not working?
The reason it doesn't work is that N .. M in the context of a for loop implicitly calls countup. You need to explicitly call countdown:

```nim
for i in countdown(high(seqStr), low(seqStr)):
  echo seqStr[i]
```
Re: ggplotnim - pretty native plots for us
Ok, so I just merged the arraymancer backend PR, which includes the changes for version v0.2.0.

v0.2.0 was mainly ridgeline plots and scale_*_reverse. Note that there is currently no recipe for a ridgeline plot. That will be added in the next few days. Also they are not as nice as they should be (essentially the top ridge doesn't change its height depending on the max values in the ridge, if overflowing of ridges into one another is allowed). scale_*_reverse just allows you to reverse scales, as the name suggests. Aside from that, a few smaller things were added (theme_void) and a few recipes that use geom_tile (annotated heatmap and plotting the periodic table).

I'm not entirely happy with the state of version v0.3.0 though, since the formula mechanism introduces several breaking changes. Arguably reading formulas is now clearer (see the beginning of the README and especially the recipes, since they all have to be compliant with the new mechanism!), but it still requires code to be changed. I think the amount of breakage is probably not that large, since not that many people will have used formulas for things yet anyway, also because the DF was discouraged before, since it was slow. Simple formulas, e.g. f{"hwy"}, remain unchanged, same as f{5} to set some constant value for an aesthetic. Previously, formulas were only required for numbers not referring to columns, since the aes proc took string | FormulaNode. Now numbers are supported directly, so to set some constant value you can just do aes(width = 0.5) instead of aes(width = f{0.5}).

In any case, I wanted to get this PR off my chest, since it was way too large. I tried to avoid breaking changes as much as possible by macro magic, but this issue: [https://github.com/nim-lang/Nim/issues/13913](https://github.com/nim-lang/Nim/issues/13913) was the nail in the coffin. So I'm just releasing it now. Feel free to open issues in case I broke your code. :)
Re: How to write shell scripts in Nim
Aside from putting that shebang line at the top of the file you want to run as a script, the file has to be saved as a NimScript file, i.e. use a .nims file ending.

someScript.nims:

```nim
#!/usr/bin/env nim
echo "Hello from NimScript!"
echo defined(NimScript)
```

and then in your terminal:

```sh
chmod +x someScript.nims
./someScript.nims
```

and it should run just fine.
Re: ggplotnim - pretty native plots for us
Thanks, maybe I'll give it a try to include it manually into the repo!

> improve performance and usability on complex apply/map

It will definitely help, but I'm already creating a single loop for each formula, no matter how many tensors are involved. E.g.

```nim
let df = ... # some DF w/ cols A, B, C, D
df.mutate(f{"Foo" ~ `A` * `B` - `C` / `D`})
```

will already be rewritten to:

```nim
var
  col0_47816020 = toTensor(df["A"], float)
  col1_47816021 = toTensor(df["B"], float)
  col2_47816022 = toTensor(df["C"], float)
  col3_47816023 = toTensor(df["D"], float)
  res_47816024 = newTensor[float](df.len)
for idx in 0 ..< df.len:
  []=(res_47816024, idx,
      col0_47816020[idx] * col1_47816021[idx] - col2_47816022[idx] / col3_47816023[idx])
result = toColumn res_47816024
```

which is indeed a little slower than a manual map_inline, but still pretty fast. Compare the first plot from here: [https://github.com/Vindaar/ggplotnim/tree/arraymancerBackend/benchmarks/pandas_compare](https://github.com/Vindaar/ggplotnim/tree/arraymancerBackend/benchmarks/pandas_compare) Not sure where the variations map_inline sees are coming from though. Effects of OpenMP?

**Small aside about the types**

The data types are determined as floats from the usage of `*`, `/` etc. They could be overridden by giving type hints:

```
f{int -> float: ...}
# ^ type of involved tensors
#        ^ type of resulting tensor
```

> AFAIK it would allow combining complex transformations and do them in a single pass instead of allocating many intermediate dataframes so performance can be an order of magnitude faster on zip/map/filter chains.

While this is certainly exciting to think about, I think it'd be pretty hard (for me in the near future anyway) to achieve while:

1. keeping it simple to extend the library by adding new procs
2. still allowing usage of the procs in a normal way as to return a new DF (without having differently named procs for inplace / not inplace variants).
But this is just me speculating from the not-all-that-simple code of zero-functional. I guess having a custom operator like it does would allow us to replace the user-given proc names though. If you have a better idea of how to do efficient chaining that seems reasonable to implement, I'm all ears.

**What I'm working on**

Right now I'm rather worrying about having decent performance for group_by and inner_join. I've been looking at [https://h2oai.github.io/db-benchmark/](https://h2oai.github.io/db-benchmark/) since yesterday. It's a rather brutal reality check, hehe. Comparing my current code on the first of the 0.5 GB group_by examples to pandas and data.table was eye opening.

In my current implementation of summarize for grouped data frames, I actually return the sub data frames for each group and apply a simple reduce operation based on the user's formula. Well, what a surprise, that's slow. I haven't dug deep into data.table or pandas yet, but as far as I can tell they essentially special-case group_by + other operation and handle these by just aggregating over all groups in a single pass. So I've implemented the same, and even for a single key with a single sum I'm 2 times slower than running the code with pandas on my machine. To be fair, performing operations on sub groups individually is a nice 100x slower than pandas.

Still, the biggest performance cost I have to accept is in order to allow grouping by columns with multiple data types. I need some way to check which subgroup a row belongs to. Since I can't create a tuple at runtime in order to just use normal comparison operators, I decided to calculate a hash for each row and compare that. That works well, but gives me that 2x speed penalty. For the time being I think I'm happy with that, unless I have a better idea / someone can point me to something that works in a typed language and doesn't involve a huge amount of boilerplate code.
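The row-hashing idea sketched in isolation (not the actual ggplotnim code, just the general technique; `rowHash` and the string-only key column are simplifications):

```nim
import hashes, tables

# combine the hashes of a row's key values; this works for mixed-type keys
# since each value only needs a `hash` overload, not a common tuple type
proc rowHash(vals: varargs[string]): Hash =
  var h: Hash = 0
  for v in vals:
    h = h !& hash(v)
  result = !$h

# assign each row index to its subgroup via the combined hash
var groups = initTable[Hash, seq[int]]()
let keyCol = @["x", "y", "x", "y"]
for i, k in keyCol:
  groups.mgetOrPut(rowHash(k), @[]).add i
echo groups.len  # 2 subgroups
```

The per-row hash computation is exactly where the 2x penalty mentioned above comes from.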
So I'm currently working on an implementation that allows to use user defined formulas for aggregation while not having to call a closure for each row.
Re: Template - how to prefix a function's name
Oh yes, I could have made that more clear. Indeed, the default type is untyped, both for the arguments as well as the return type. And yes, untyped is required to make this work. Essentially untyped is just treated as a raw Nim identifier (nnkIdent in macro terms). If you used string as an argument, the compiler would understand that as a string at runtime. Since the name of the generated proc / etc. has to be known at compile time of course, this wouldn't work.

You _can_ (although I don't think with a template) hand over a static string, which is a string known at compile time, and construct an identifier from the string. But unless you do more complicated macro things where you might want to compute the names of procs you generate, this won't be much different from just handing over a raw identifier. An example:

```nim
import macros

macro genProc(prefix: static string): untyped =
  let procName = ident(prefix & "World")
  result = quote do:
    proc `procName`() = echo "Hello world"

genProc("hello") # works, string literal is static
helloWorld()

const foo = "alsoHello"
genProc(foo) # a const string is known at CT
alsoHelloWorld()

# and also:
proc getName(): string =
  result = "finalHello"

var bar {.compileTime.}: string
static:
  bar = getName() # some CT computation
genProc(bar)
finalHelloWorld()
```
Re: ggplotnim - pretty native plots for us
Some simple benchmarks comparing the new backend to pandas at: [https://github.com/Vindaar/ggplotnim/tree/arraymancerBackend/benchmarks/pandas_compare](https://github.com/Vindaar/ggplotnim/tree/arraymancerBackend/benchmarks/pandas_compare) Note that I ran the code on a default pandas installation on my Void Linux, without BLAS. But I also compiled the Nim code without BLAS support. It's just a port of a pandas / numpy comparison from here: [https://github.com/mm-mansour/Fast-Pandas](https://github.com/mm-mansour/Fast-Pandas)

All in all the new backend (let's call it datamancer from now on, heh) is significantly faster for all operations which essentially just rely on @mratsim's work. For a few others, specifically unique and sorting, it's slightly slower. But given the implementation of those, I'm actually rather happy with that. And especially for small data frame sizes, the function call / looping overhead Python has to bear is ridiculous.

I'll focus on finishing up the open PR (ridgelines and a bit more) and then finish this.
Re: ggplotnim - pretty native plots for us
I've started to implement the arraymancer backend into the actual code of ggplotnim now. Most recipes are compiling and working fine now. The rules for formula creation have changed a little bit, but actually provide more control now. There's no documentation about the rules until everything works, except: [https://github.com/Vindaar/ggplotnim/blob/arraymancerBackend/playground/arraymancer_backend.nim#L946-L958](https://github.com/Vindaar/ggplotnim/blob/arraymancerBackend/playground/arraymancer_backend.nim#L946-L958) and the modified recipes. The code currently uses the arraymancer backend by default (-d:defaultBackend to use the old one; yeah irony).
Re: ggplotnim - pretty native plots for us
Super short update: I essentially reached feature parity of the arraymancer backend DF right now. Still WIP, but the implementation currently lives on the arraymancerBackend branch in the playground dir here: [https://github.com/Vindaar/ggplotnim/blob/arraymancerBackend/playground](https://github.com/Vindaar/ggplotnim/blob/arraymancerBackend/playground) One of the worst offenders of performance before was gather if many columns were involved. An example from this issue: [https://github.com/Vindaar/ggplotnim/issues/39](https://github.com/Vindaar/ggplotnim/issues/39) is down from 12.5 s to only 0.05 s for only the gather call. Progress! :)
Re: ggplotnim - pretty native plots for us
So I did a thing today… (which is why I haven't answered yet).

This morning I took another look at a rewrite of the `DataFrame` using an arraymancer backend. Turns out by rethinking a bunch of things and especially the current implementation of the `FormulaNode`, I managed to come up with a seemingly working solution. This is super WIP and I've only implemented `mutate`, `transmute` and `select` so far, but first results are promising.

Essentially the `FormulaNode` from before is now compiled into a closure, which returns a full column. So the following formula:

```nim
f{"xSquared" ~ "x" * "x"}
```

will assume that each string is a column of a data frame and create the following closure:

```nim
proc(df: DataFrame): Column =
  var
    colx_47075074 = toTensor(df["x"], float)
    colx_47075075 = toTensor(df["x"], float)
    res_47075076 = newTensor[float](df.len)
  for idx in 0 ..< df.len:
    []=(res_47075076, idx, colx_47075075[idx] * colx_47075074[idx])
  result = toColumn res_47075076
```

The data types for the columns and the result data type are currently based on heuristics, given the things that appear in the formula. E.g. if math operators appear it's float, if boolean operators it's bool, etc.

The data frame now looks like:

```nim
DataFrame* = object
  len*: int
  data*: Table[string, Column]
  case kind: DataFrameKind
  of dfGrouped:
    # a grouped data frame stores the keys of the groups and maps them to
    # a set of the categories
    groupMap: OrderedTable[string, HashSet[Value]]
  else: discard
```

where a `Column` is:

```nim
Column* = object
  case kind*: ColKind
  of colFloat: fCol*: Tensor[float]
  of colInt: iCol*: Tensor[int]
  of colBool: bCol*: Tensor[bool]
  of colString: sCol*: Tensor[string]
  of colObject: oCol*: Tensor[Value]
```

`colObject` is the fallback for columns which contain more than one data type.
So I only wrote a super simple for loop to get a rough idea how fast/slow this might be:

```nim
import arraymancer_backend
import seqmath, sequtils, times
#import ggplotnim # for comparison with current implementation

proc main(df: var DataFrame, num: int) =
  let t0 = cpuTime()
  for i in 0 ..< num:
    df = df.mutate(f{"xSquared" ~ "x" * "x"})
  let t1 = cpuTime()
  echo "Took ", t1 - t0, " for ", num, " iter"

proc rawTensor(df: DataFrame, num: int) =
  var t = newTensor[float](df.len)
  let xT = df["x"].toTensor(float)
  let t0 = cpuTime()
  for i in 0 ..< num:
    for j in 0 ..< df.len:
      t[j] = xT[j] * xT[j]
  let t1 = cpuTime()
  echo "Took ", t1 - t0, " for ", num, " iter"

when isMainModule:
  const num = 1_000_000
  let x = linspace(0.0, 2.0, 1000)
  let y = x.mapIt(0.12 + it * it * 0.3 + 2.2 * it * it * it)
  var df = seqsToDf(x, y)
  main(df, num)
  rawTensor(df, num)
```

Gives us:

new DF:

* `Took 9.570060132 for 100 iter`

raw arraymancer tensor:

* `Took 1.034196647 for 100 iter` (so still some crazy overhead!)

While the old DF took 23.3 seconds for only 100,000 iterations! So about a factor 23 slower than the new code.

Probably really bad comparison with pandas:

```python
import time
import numpy as np
import pandas as pd

x = np.linspace(0.0, 2.0, 1000)
y = (0.12 + x * x * 0.3 + 2.2 * x * x * x)
df = pd.DataFrame({"x": x, "y": y})

def call():
    t0 = time.time()
    num = 10
    for i in range(num):
        df.assign(xSquared = df["x"] * df["x"])
    t1 = time.time()
    print("Took ", (t1 - t0), " for 1,000,000 iterations")

call()
```

`Took 60.24467134475708 for 100,000 iterations`

I suppose using assign and accessing the columns like this is probably super inefficient in pandas?
And a (also not very good) comparison with `NimData`:

```nim
import nimdata
import seqmath, sequtils, times, sugar

proc main =
  let x = linspace(0.0, 2.0, 1000)
  let y = x.mapIt(0.12 + it * it * 0.3 + 2.2 * it * it * it)
  var df = DF.fromSeq(zip(x, y))
  df.take(5).show()
  echo df.count()
  const num = 1_000_000
  let t0 = cpuTime()
  for i in 0 ..< num:
    df = df.map(x => (x[0], x[0] * x[0])).cache()
  let t1 = cpuTime()
  echo "Took ", t1 - t0, " for ", num, " iter"

when isMainModule:
  main()
```

`Took 16.322826325 for 1,000,000 iter`

I'm definitely not saying the new code is faster than NimData or pandas, but it's definitely promising! I'll see where this takes me. I think though I managed to implement the main things I was worried about. The rest should just be tedious work. Will keep you all posted.
Re: Template - how to prefix a function's name
You do it like this:

```nim
template mytempl(prefix) =
  proc `prefix World`() = echo "hello world"
```

See here: [https://nim-lang.github.io/Nim/manual.html#templates-identifier-construction](https://nim-lang.github.io/Nim/manual.html#templates-identifier-construction)
Re: ggplotnim - pretty native plots for us
@spip I'll answer your question below as well.

> Is this compatible with other libraries, such as arraymancer, etc? I think that one of the biggest strengths of the python numerical ecosystem is the good inter-operability of most plotting libraries with numpy. So if that is not already the case I would suggest making that your highest priority.

The answer to that is "sort of". I'll need to explain a little to answer the why and what I mean by "sort of".

**The long answer**

Originally when I started the library I never planned to write a data frame library to go with it. I quickly realized however that (at least with a library like `ggplot2`) one doesn't work well without the other. In a normal plotting library every plotting function is a special case: essentially each kind of plot wants data in a specific form / of a specific data type.

So in the beginning I specifically didn't want to use arraymancer internally. I love that library, but given that all I wanted to write was a "plotting library", this meant two things for me:

* The library is essentially a sink for the user's data. It doesn't return anything, so there's no reason for the internal data type to conform to any standards.
* If a user wants to create a plot, performance will **not** be an important consideration (which does not imply performance of a plotting library doesn't matter!). Creating a plot will always be slow compared to pure number crunching. There are use cases for libraries which can create plots at several hundred fps, but to be honest, if I need to create a huge number of plots and am thus performance sensitive, the question is whether a plotting library is the best tool in the first place.

For this reason I decided to avoid having arraymancer as a dependency, because all its strengths are mostly useless for the intended purpose, while it would mean introducing an unnecessary dependency.
If a user is using arraymancer for calculations, it's easy to convert the required data to ggplotnim's data types. I felt the overhead of copying the data was not a big deal under the assumption mentioned above.

**But**, things did somewhat change when I started to write the data frame. My first idea was actually to use `NimData`, since I really like that library. However, the (depending on viewpoint) advantage / disadvantage that its type is entirely defined via a schema at compile time didn't appeal to me. I didn't want to end up with a ggplot2 clone that was super restrictive, because everything had to be known at compile time. I was actually hoping that @bluenote would pick up his development of Kadro again: [https://github.com/bluenote10/kadro](https://github.com/bluenote10/kadro) That sounded perfectly suited. But since he didn't, I simply started to hack together something that suits the needs of the library.

Originally in fact the `DataFrame` type was generic and my goal was to write the code in such a way that the underlying type does not matter. This made things complicated though. In fact I even thought about an arraymancer backend from the start: [https://github.com/Vindaar/ggplotnim/blob/master/playground/arraymancer_backend.nim](https://github.com/Vindaar/ggplotnim/blob/master/playground/arraymancer_backend.nim) which however never progressed from there, mainly because I couldn't figure out how to make use of arraymancer's performance when the majority of data frame operations I did ended up copying data around. Which is how I ended up with @PMunch's persistent vector from Clojure. It kind of allowed me to "copy as much as I want" without the performance penalty.

This is how we got to the current situation. The data frame is okayish fast for the simple things needed to prepare a plot. For anything else, I can't recommend it (also because it's extremely lenient on types!).

**tl;dr** Compatibility with the "rest of the ecosystem" isn't there for practical reasons.
The thing is I'd love to profit from @mratsim's amazing work on arraymancer and laser! Once I go back and reconsider the performance of the data frame, I hope I will end up using as much of arraymancer as I can, to be honest. I just need to figure out how to do it. :)

> Other than that, I didn't see mention of support for contour plots in the docs. It is surprising how often those come in handy in many scenarios so I'd like for you to add that if it is not available yet. Another thing I like to do is to combine line plots with histograms and/or kernel density plots on the X and Y axis (to get a quick idea of the distribution of the values, particularly in time series). It would be neat to support for that too.

Good point. Contour plots are something I simply didn't think about. I've never actually thought about how those are implemented before. I guess it's just a 2-dimensional KDE, right? Since I will
ggplotnim - pretty native plots for us
Hey! As many of you will be aware by now, I started to write a port of [ggplot2](https://ggplot2.tidyverse.org/) some time mid last year: [https://github.com/Vindaar/ggplotnim](https://github.com/Vindaar/ggplotnim)

After many sometimes frantic sessions working on this, I'm finally approaching a first personal milestone: essentially all features I consider essential for a plotting library (for my personal use cases!) are (or are about to be) implemented. This will mark the release of version `v0.3.0`. The remaining features I will implement in the next few days are:

* `geom_density`: to create smooth density estimates of continuous variables using kernel density estimation (KDE). I've implemented a naive KDE with complexity `O(m x n)` for testing and it works very well (but it's very slow obviously). I want to improve that before merging it. If anyone has a good resource for a simple to implement but reasonably performant KDE implementation / algorithm, feel free to post it!
* `geom_ridgeline`: ridgeline plots (or joyplots) are fun and pretty! Should be straightforward to implement.
* re-activate `facet_wrap`: `facet_wrap` has been dormant for a few months now, because an internal rewrite broke it at some point. The implementation is there, but I need to fix the layouting, which is even more broken now than before. But that should also be fairly easy.

Now, the main reason I open this topic is to ask all of you what I should focus on once the above is done.

# Possible things to work on

There are several ideas I have in mind, but definitely not the time to tackle them all at the same time. They are:

## properly implement the Vega-Lite backend

One of the main goals I had in mind when starting this whole project was to provide two different plotting backends. One native target to produce plots locally, fast and statically.
On the other hand, originally inspired by @mratsim's [monocle](https://github.com/numforge/monocle), a [Vega-Lite](https://vega.github.io/vega-lite/) backend to scratch that interactive / web based itch, which allows for easy sharing of plots **including data**! I wrote a [proof of concept](https://github.com/Vindaar/ggplotnim#experimental-vega-lite-backend) and by now I have a pretty good idea (barring a lack of Vega experience) of how to implement this. Essentially the whole processing of the plot as done now remains the same. This allows making use of the whole functionality of `ggplotnim` without a lot of duplication. The drawing code will be replaced by a mapping to JSON instead. The major work would be in defining said mapping. If I'm lucky I can even write it as a [ginger backend](https://github.com/Vindaar/ginger/blob/master/src/ginger/backendCairo.nim) with a - for Vega pretty obscure - API (`drawPoint`, `drawLine`, etc. essentially just adding data to a `JsonNode`). More likely it'll involve replacing the [drawing portion](https://github.com/Vindaar/ggplotnim/blob/master/src/ggplotnim/ggplot_drawing.nim#L342-L364) of `ggplotnim` with Vega-related drawing equivalents.

## improve `DataFrame` performance

The included data frame in `ggplotnim` is - for many operations anyways - abysmally slow. While performance is nice, I mainly wanted something to work with "right now" instead of spending a lot of time writing a performant data frame. The reasons for the performance are three-fold, as far as I can tell:

* for some operations the algorithms used are inefficient
* the underlying data type is a `Value` similar to a `JsonNode`. Conversion to and from normal types is slow and operations on `Value` are also slow, since there are always case statements involved and at least one indirection to access the actual value.
* each column is a `PersistentVector[Value]`. For most operations this is a major performance boost over a `seq[Value]`, since we avoid a large amount of copying. However, iterating over long vectors or building long vectors is slow.

One thing to improve performance would be to introduce the distinction between a pure column of one data type and `Value` columns (which are somewhat similar to `object` types in numpy / pandas, if my superficial understanding of those is correct). While I'm not certain, I believe that distinction alone would make the code a lot more complex and would definitely require a lot of use of generics. Generics are something I specifically wanted to avoid in the context of a data frame, because each time I played around with toy data frames this became a headache. The only idea to avoid generics would be to extend a `Value` to also have a case for vector-like data, similar to `JsonNode`'s `JArray`. That would double the number of fields though. In any case, if I were to seriously attempt to improve performance of the data frames, I would stop messing around myself and first do some research into how data frames are handled elsewhere.
Re: Performance test agains Python
As far as I know such simple string manipulations are actually pretty fast in Python. So don't expect an amazing speed improvement over Python if your code is this simple. In more "real world" examples you'll see Nim outperforming Python.
Re: Performance test agains Python
You have to be aware that using strings this way is always going to be somewhat inefficient, since each `replace` call makes a copy! Especially in the following:

```nim
f2.writeLine(sLine.replace(sFind, sReplaced.replace("\"", "")))
```

the `sReplaced.replace("\"", "")` seems unnecessary. Why not perform that replacement when defining `sReplaced` above? Since it doesn't depend on the current line, it's going to be the same either way.

Also, as far as I can tell, the whole `find` seems unnecessary too. If `replace` cannot find the string `sFind`, no replacement will take place. So you can just replace:

```nim
if sLine.find(sFind) > -1:
  f2.writeLine(sLine.replace(sFind, sReplaced.replace("\"", "")))
else:
  f2.writeLine(sLine)
```

by

```nim
f2.writeLine(sLine.replace(sFind, sReplaced)) # with `sReplaced` changed as above
```

Especially given that the substring seems to be found in only 1/4 of the cases, I imagine this should be faster. The little overhead of `replace` over `find` shouldn't matter in that case. To be fair, both points also apply to the Python code.
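A minimal self-contained sketch of the simplified version (the variable names and contents are stand-ins for the thread's code, not the original data):

```nim
import strutils

# hypothetical stand-ins for the thread's variables
let sFind = "foo"
# strip the quotes once, up front, instead of on every line
let sReplaced = "say \"bar\"".replace("\"", "")

for sLine in ["a foo line", "no match here"]:
  # no `find` guard needed: `replace` returns the input unchanged
  # when the pattern does not occur
  echo sLine.replace(sFind, sReplaced)
```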
Re: Arraymancer and --gc:arc
As far as I'm aware these two things aren't possible with Arraymancer at the moment. But using a normal `seq[T]` there are ways to do both.

For interpolation:

* numericalnim by @hugogranstrom: [https://github.com/HugoGranstrom/numericalnim#natural-cubic-splines](https://github.com/HugoGranstrom/numericalnim#natural-cubic-splines)
* to an extent seqmath by @jlp765 (and me, though not on the interpolation parts): [https://github.com/Vindaar/seqmath/blob/master/src/seqmath/smath.nim#L639](https://github.com/Vindaar/seqmath/blob/master/src/seqmath/smath.nim#L639) and below

And for FFT the only way I'm aware of at the moment is via the C library kiss FFT: [https://github.com/m13253/nim-kissfft](https://github.com/m13253/nim-kissfft) I wrote down a couple of notes (mainly for myself and @kaushalmodi) about using kiss FFT from Nim a couple of days ago: [https://gist.github.com/Vindaar/fc158afbc75627260aed90264398e473](https://gist.github.com/Vindaar/fc158afbc75627260aed90264398e473)
Re: TimeFormatParseError using period character '.' as date separator
That's because there are only a few characters which can be used directly in a format string. To use other characters / longer strings, you have to put them between `'` quotes, like so:

```nim
echo dt.format("'.'mm'.'dd")
```

See the table here: [https://nim-lang.github.io/Nim/times.html#parsing-and-formatting-dates](https://nim-lang.github.io/Nim/times.html#parsing-and-formatting-dates)
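A runnable sketch for a full period-separated date (my own example; note that per the times docs `MM` is the month pattern, while lowercase `mm` means minutes):

```nim
import times

let dt = initDateTime(1, mJan, 2020, 12, 0, 0, utc())
# literal characters (the periods) go between single quotes
echo dt.format("yyyy'.'MM'.'dd")  # -> 2020.01.01
```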
Re: Strange Macro Behavior
I don't understand what you actually want to accomplish, but I'm pretty sure you don't actually want a generic macro. Your "would love to do that" idea actually works, with a small modification: just don't use explicit types, but use `varargs[typed]`. Then you can extract the type information in the macro. As long as you don't want to do crazy things with types in macros, it all works well.

```nim
import macros, sugar

type
  Generator[T] = () -> T

proc gen1(): float = result = 42.0
proc gen2(): int = result = 66
proc gen3(): string = result = "Hello"

# works fine if you use `varargs[typed]`
macro genTuple(args: varargs[typed]): untyped =
  var types: seq[NimNode]
  var impls: seq[NimNode]
  for ch in args:
    types.add ch.getTypeImpl
    impls.add ch.getImpl
  echo types.repr
  echo impls.repr

genTuple(gen1, gen2, gen3)
```

where I extracted both the actual implementation (if you wanted to do something to those procs / their bodies) and their types. From there you can do whatever you want with those types.
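As one hedged sketch of "doing whatever you want" from there: the same `varargs[typed]` trick can call each proc and pack the results into a tuple (the `callAll` name and approach are my own, not from the thread):

```nim
import macros

proc gen1(): float = 42.0
proc gen2(): int = 66
proc gen3(): string = "Hello"

# build the tuple constructor `(gen1(), gen2(), gen3())` at compile time
macro callAll(args: varargs[typed]): untyped =
  result = nnkTupleConstr.newTree()
  for ch in args:
    result.add nnkCall.newTree(ch)

echo callAll(gen1, gen2, gen3)
```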
Re: Compile time FFI
I essentially had the same use case as @PMunch in the past. When I thought about implementing reading of Keras-stored NNs in Arraymancer, the problems were that:

1. creating a NN in Arraymancer currently means using the DSL to design the network layout. Since that DSL generates a whole bunch of procs etc., doing that at runtime is problematic.
2. Keras networks are stored in HDF5 files, with the network layout in attributes of some groups.

So for the most straightforward way to implement this, I wanted to read the attributes of Keras HDF5 files at compile time. Given that HDF5 is a rather complicated file format, implementing my own Nim-based parser (even if only for attributes) wasn't really in the cards. There are certainly solutions to this without requiring compile time FFI, of course.
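For context, plain compile-time file reading (no FFI involved) does already work via `staticRead`. That's enough for simple text formats, though obviously no substitute for a real HDF5 parser:

```nim
# `staticRead` pulls a file's raw bytes into a compile-time constant;
# here we read this very source file via its own path
const self = staticRead(currentSourcePath())

doAssert self.len > 0
echo "read ", self.len, " bytes at compile time"
```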
Re: Is "danger" define supposed to also define "release"?
Oh wow, I somehow thought it was intended behavior, since @mratsim advocated compiling with `-d:release -d:danger` almost immediately after `-d:danger` was introduced. So I thought you were aware of this.
Re: Help with set
The error message for the second case isn't as clear as it could be, if you don't know what to look for maybe. It says:

```
Error: type mismatch: got <...> but expected 'CharSet = set[int16]'
```

See the `(int)` after the range in the "got" part? That tells you the type it gets is actually `int`. Your set however takes `int16`. So to make it work you have to give explicit `int16` literals:

```nim
x = {1'i16 .. 9'i16, 15'i16, 45'i16 .. 78'i16}
```
Re: Getting fields of an object type at compile time?
You might want something like this:

```nim
import macros

type
  Foo = object
    a: int
    b: string

macro getFields(x: typed): untyped =
  let impl = getImpl(x)
  let objFields = impl[2][2] # [2] is the nnkObjectTy, [2][2] the nnkRecList
  expectKind objFields, nnkRecList
  result = nnkBracket.newTree()
  for f in objFields:
    expectKind f, nnkIdentDefs
    result.add nnkPar.newTree(newLit f[0].strVal, newLit f[1].strVal)

for (fieldName, typeName) in getFields(Foo):
  echo "Field: ", fieldName, " with type name ", typeName
```

It returns a bracket of (field name, type name) tuples, both as strings, since you can't mix strings with types in a tuple. For more complicated objects you'd have to recurse on the fields with complex types, of course.
Re: Advent of Nim 2019 megathread
I should participate again I guess. I fear I'll have even less time than last year though. We'll see!
Re: Empty sequence of specific type given problems when compiling with "cpp"
Ah yes, indeed. If `DICT` is just a proc that takes a `Context`, this also works btw:

```nim
let ret = DICT(@[])
```

The compiler can deduce the type of the empty seq from the signature `proc DICT(c: Context)`. Although personally, if the use case of an empty seq as the argument comes up more often, I'd make the empty sequence the default value for the argument instead: `proc DICT(s: Context = @[])`. And a `proc newContext(len: int): Context` helper is probably also useful to emphasize the intention.
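To illustrate the default-argument suggestion, a self-contained sketch where `Context` and the body of `DICT` are my own stand-ins (the thread doesn't show the real definitions):

```nim
# hypothetical: `Context` as a seq type, `DICT` just reports the length
type Context = seq[string]

proc DICT(s: Context = @[]): int = s.len

proc newContext(len: int): Context = newSeq[string](len)

echo DICT()              # no `@[]` needed at the call site
echo DICT(newContext(3)) # intention is clear from the helper's name
```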
Re: Empty sequence of specific type given problems when compiling with "cpp"
If I'm not missing something, you should call `newSeq` instead:

```nim
var ret = DICT(newSeq[Context]())
```

But of course, any Nim code you write should either error during Nim compilation or compile successfully - errors from the C++ backend compiler shouldn't happen.
Re: Where is "taint mode" documented?
That part of the manual is now in the experimental document here: [https://nim-lang.github.io/Nim/manual_experimental.html#taint-mode](https://nim-lang.github.io/Nim/manual_experimental.html#taint-mode)
Re: Confused about how to use ``inputStream`` with ``osproc.startProcess``
Sorry, I didn't see the post before. You're almost there. There are two small things missing in your code:

1. you should add a newline to the string you write to the input stream. `cat` wants that newline.
2. instead of closing the input stream, you flush it.

Note however that with `cat` at least, the output stream will never be "done". So you need some stopping criterion.

```nim
import osproc, streams

proc cat(strs: seq[string]): string =
  var command = "cat"
  var p = startProcess(command, options = {poUsePath})
  var inp = inputStream(p)
  var outp = outputStream(p)
  for str in strs:
    # append a newline!
    inp.write(str & "\n")
  # make sure to flush the stream!
  inp.flush()
  var line = ""
  var happy = 0
  while p.running:
    if happy == strs.len:
      # with `cat` there will always theoretically be data coming from the
      # stream, so we need some artificial stopping criterion (maybe just Ctrl-C)
      break
    elif outp.readLine(line):
      result.add(line & "\n")
      inc happy
  close(p)

echo cat(@["hello", "world"])
```
Re: Web applications and pattern match
Good to hear! I couldn't find the source for the new version of the live demo though. In the markdown document it's still the old code as far as I can tell. Yes, please just ask!
Re: Problems with Emacs mode for Nim
Be that as it may, the fact remains that nim-mode is written in elisp. ;) I don't even think these changes are hard, but I wouldn't know how to find the code responsible for those indentations without studying all of nim-mode first.
Re: Web applications and pattern match
So originally I wanted to write up a nice example to do the replacements via the `scanf` macro: [https://nim-lang.github.io/Nim/strscans.html](https://nim-lang.github.io/Nim/strscans.html) by defining tuples of strings to match against and their replacements, but I hit a dead end, because an element of a const tuple doesn't count as a static string for the pattern. Also `scanf` turned out to be more problematic than I thought, because the `$*` term does not like to match any string until the end.

But since your book is (at least partly) about Nim macros and writing macros is fun, I built the following even longer version of your code, haha. It also includes a custom matcher that matches anything until the end of the string.

```nim
# File: web.nim
import strutils, os, strscans, macros

let input = open("rage.md")
let form = """
"""

echo "Content-type: text/html\n\n"
echo """
"""
echo ""

proc rest(input: string; match: var string, start: int): int =
  ## matches until the end of the string
  match = input[start .. input.high]
  # result is either 1 (string is empty) or the number of found chars
  result = max(1, input.len - start)

macro match(args, line: typed): untyped =
  ## match the `args` via `scanf` in `line`. `args` must be a `[]` of
  ## `(scanf string matcher, replacement string)` tuples, where the latter
  ## has to include a single `$#` to indicate the position of the replacement.
  ## The order of the `args` is important, since an if statement is built.
  let argImpl = args.getImpl
  expectKind argImpl, nnkBracket
  result = newStmtList()
  let matched = genSym(nskVar, "matched")
  result.add quote do:
    var `matched`: string
  var ifStmt = nnkIfStmt.newTree()
  for el in argImpl:
    expectKind el, nnkTupleConstr
    let toMatch = el[0]
    let toReplace = el[1]
    let ifBody = nnkStmtList.newTree(
      nnkCall.newTree(ident"echo", nnkCall.newTree(ident"%", toReplace, matched)),
      nnkAsgn.newTree(matched, newLit("")))
    let ifCond = nnkCall.newTree(ident"scanf", line, toMatch, matched)
    ifStmt.add nnkElifBranch.newTree(ifCond, ifBody)
  result.add ifStmt
  echo result.repr

const h1title = ("# ${rest}", "$#")
const h2title = ("## ${rest}", "$#")
const elseLine = ("${rest}", "$#")
const replacements = [h1title, h2title, elseLine]
for line in input.lines:
  match(replacements, line)
  # produces:
  # var matched: string
  # if scanf(line, "# ${rest}", matched):
  #   echo h1title[1] % matched
  # elif scanf(line, "## ${rest}", matched):
  #   echo h2title[1] % matched
  # elif scanf(line, "${rest}", matched):
  #   echo elseLine[1] % matched

echo form
let qs = getEnv("QUERY_STRING", "none").split({'+'}).join(" ")
if qs != "none" and qs.len > 0:
  let output = open("visitors.txt", fmAppend)
  write(output, qs & "\n")
  output.close
let inputVisitors = open("visitors.txt")
for line in inputVisitors.lines:
  match(replacements, line)
inputVisitors.close
echo ""
input.close
```

This is totally not practical I'd say, and one's better off writing something by hand or using the excellent [https://github.com/zevv/npeg](https://github.com/zevv/npeg) by @zevv. Still fun though. And if someone wants to improve on this...

Finally, to just remove a prefix of a string, you may use `removePrefix` from strutils: [https://nim-lang.github.io/Nim/strutils.html#removePrefix%2Cstring%2Cstring](https://nim-lang.github.io/Nim/strutils.html#removePrefix%2Cstring%2Cstring) Note that it only works in place on a `string`.
You could use the new outplace though: [https://github.com/nim-lang/Nim/pull/12599](https://github.com/nim-lang/Nim/pull/12599)
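A quick sketch of the in-place behavior of `removePrefix` (my own example):

```nim
import strutils

var line = "## Subheading"
line.removePrefix("## ")  # in-place: mutates `line`, returns nothing
doAssert line == "Subheading"
```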
Re: Problems with Emacs mode for Nim
You're right that there are a couple of examples where the indentation of nim-mode is all over the place, and this is one of them. I've been meaning to take a look at this too, but lack of time and not being that experienced with elisp means I haven't done so. Something like this is another:

```nim
proc someProc(binWidth = 0.0,
              breaks: seq[float] = @[],
              binPosition = "none" # <- tab in this line will put it
              # binPosition = "none", # <- here
             ): ReturnVal =
```

The reason in your specific case of course is the tuple unpacking. It seems the opening paren is confusing nim-mode. In my case it's the default `@[]` for `breaks`: remove that and it works. As far as I'm aware @krux02 did most of the recent development on nim-mode. Also @kaushalmodi comes to mind as someone who could probably fix this easily.
Re: Marshal and friends
> For JSON to() macro we have
>
> > Heterogeneous arrays are not supported.
>
> I have never seen that term in Nim world before ???

That just refers to the fact that in JSON you can of course have a heterogeneous array, like:

```nim
let heterogeneous = %* [1, 2.5, "Hello"]
```

and these simply cannot be mapped to Nim types properly.
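To make the contrast concrete (my own example): a homogeneous JSON array converts via `to`, while a heterogeneous one can only be handled as a plain `JsonNode`:

```nim
import json

# a homogeneous array maps cleanly onto a Nim seq...
let homogeneous = %* [1, 2, 3]
doAssert homogeneous.to(seq[int]) == @[1, 2, 3]

# ...but no single Nim element type fits a heterogeneous one, so `to`
# cannot work; the data is still accessible node by node
let heterogeneous = %* [1, 2.5, "Hello"]
doAssert heterogeneous[2].getStr == "Hello"
```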
Re: A taxonomy of Nim packages
I just opened an issue on the awesome-nim repo about adding a few more collaborators, so that PRs can be merged more quickly. [https://github.com/VPashkov/awesome-nim/issues/65](https://github.com/VPashkov/awesome-nim/issues/65)
Re: What is the difference between "writeFile" and "newFileStream" and "write"?
The answer is simply: `writeFile` sure can write those and it shouldn't break them. I do essentially the same here, the only difference being where the image comes from: [https://github.com/brentp/nim-plotly/blob/master/src/plotly/image_retrieve.nim#L119](https://github.com/brentp/nim-plotly/blob/master/src/plotly/image_retrieve.nim#L119) And I just ran your first code and it works fine on my end.
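As a sanity check of that claim (the file name here is my own, hypothetical): `writeFile` writes the string's bytes verbatim, NUL bytes and all, so binary data round-trips unchanged:

```nim
# a few "awkward" bytes, including NUL and a PNG-style magic header
let data = "\x89PNG\x0D\x0A\x1A\x0A\x00\xFFpayload"
writeFile("test.bin", data)
doAssert readFile("test.bin") == data
```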
Re: Nim for Statistics
May I ask which main features you'd require the ecosystem to provide for you to consider Nim? I'm asking since everyone's use cases are different and in my opinion it's important to know what people actually want and need. For instance for [ggplotnim](https://github.com/vindaar/ggplotnim) I know which features are important to me, so that's how I choose what to work on. But if I knew there were people who consider the stats aspects of ggplot2 to be more important than geom_whatever, then I would consider working on those instead.
Re: Nim for Statistics
To add to the very good answers so far, I'd mention that there is an issue which tracks scientific libraries here: [https://github.com/nim-lang/needed-libraries/issues/77](https://github.com/nim-lang/needed-libraries/issues/77) And to answer your explicit question whether Nim is _suitable_ for statistics, I'd answer with a definitive YES. But of course being suitable does not mean most libraries you'd like to use exist, just that in my opinion it's a perfect language to write / port those libraries in / to. Aside from that I'm personally not a fan of e.g. jupyter notebooks anyways. And given the quick compile times I don't feel the need. I rather like to go the literate programming path, like e.g. here: [https://github.com/Vindaar/TimepixAnalysis/tree/refactorRawManipulation/Doc/other](https://github.com/Vindaar/TimepixAnalysis/tree/refactorRawManipulation/Doc/other)
Re: Retrieving field names of an enumeration or other types?
If I understand you correctly, here you go:

```nim
import macros

type
  Foo = enum
    foo = "Foo"
    bar = "Bar"
    more = "More"

macro enumFields(n: typed): untyped =
  let impl = getType(n)
  expectKind impl[1], nnkEnumTy
  result = nnkBracket.newTree()
  for f in impl[1]:
    case f.kind
    of nnkSym, nnkIdent:
      result.add newLit(f.strVal)
    else: discard

for f in enumFields(Foo):
  echo f
```
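If only the string values of the fields are needed, a macro isn't even required (a simpler alternative, not from the original answer): enums are iterable over their range, and `$` yields the string value:

```nim
type
  Foo = enum
    foo = "Foo"
    bar = "Bar"
    more = "More"

# collect the string representation of every enum field
var names: seq[string]
for f in low(Foo) .. high(Foo):
  names.add $f
echo names  # @["Foo", "Bar", "More"]
```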
Re: Requesting examples of macros in Nim
While I'm not sure what kind of features the `times -> j` syntax should allow (or whether `times` and `->` are fixed), the simplest implementation for the second usage I can come up with is:

```nim
import macros, strutils, os

macro theMagicWord(statements: untyped): untyped =
  result = statements
  for st in statements:
    for node in st:
      if node.kind == nnkStrLit:
        node.strVal = node.strVal & ", Please."

proc parseArgs(cmd: NimNode): (NimNode, NimNode) =
  doAssert cmd.len == 2
  expectKind(cmd[1], nnkInfix)
  expectKind(cmd[1][0], nnkIdent)
  expectKind(cmd[1][1], nnkIdent)
  expectKind(cmd[1][2], nnkIdent)
  doAssert cmd[1][0].strVal == "->"
  doAssert cmd[1][1].strVal == "times"
  result = (cmd[0],     # leave cmd[0] as is, has to be a valid integer expr
            cmd[1][2])  # identifier to use for the loop

macro rpt(cmd: untyped, stmts: untyped): untyped =
  expectKind(cmd, nnkCommand)
  expectKind(stmts, nnkStmtList)
  let (toIdx, iterVar) = parseArgs(cmd)
  result = quote do:
    for `iterVar` in 1 .. `toIdx`:
      `stmts`
  echo result.repr

# old macro
#rpt j, paramStr(1).parseInt:
#  theMagicWord:
#    echo j, "- Give me some bear"
#  echo "Now"

rpt paramStr(1).parseInt times -> j:
  theMagicWord:
    echo j, "- Give me some bear"
  echo "Now"
```
Re: netcdf for nim
Until right now I didn't even know of grib2 files. I need to read up on what kind of file format that is first. I suppose it's based on HDF5, too? If it is, I should think we can make it work. Regarding other nimhdf5 users: I'd love to be corrected, but as far as I'm aware I'm the only actual user of HDF5 files in Nim land so far. Or the other people using nimhdf5 are so happy with it that they don't raise any issues. Given the basically non-existent documentation and extremely sparse examples (sorry about that :/), that'd be a surprise.
Re: Error: expression has no type (or is ambiguous)
The compiler complains because you assign the result of the `cleanXmi` call to `newNode`. Thus `cleanXmi` has to return a value of some type.
Re: Nim beginners tutorial
I'll check it on my kindle tonight. Ping me on gitter if you don't hear from me until tomorrow.
Re: How to use file system watcher (fsmonitor) in Nim?
Apparently there's now a wrapper for libfswatch: [https://github.com/FedericoCeratto/nim-fswatch](https://github.com/FedericoCeratto/nim-fswatch) While no help to you, maybe it's interesting for someone else reading this. I ported over fsmonitor to modern asyncdispatch in February, since I quickly had to hack together an online event display. But it only supports linux, same as the old fsmonitor. [https://github.com/Vindaar/fsmonitor2](https://github.com/Vindaar/fsmonitor2) However, looking at the File System Event API for OSX, adding support doesn't seem all that complicated: [https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html](https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html) I don't have a Mac, so attempting that would be a pain. Sounds like a fun weekend project for someone to attempt though. :)
Re: Need debugging help
I wasn't sure whether you actually fixed your code with that post now, but I was already looking at your code when I saw the post, so I continued. I fixed the code differently, by just removing the c types you used. I trust that the test case works, because I'm not sure if I broke something. :) [https://github.com/pb-cdunn/nim-help/pull/2](https://github.com/pb-cdunn/nim-help/pull/2)
Re: netcdf for nim
Hey! As far as I'm aware there are no bindings to the NetCDF library so far. I personally don't have any experience working with NetCDF. However, I'm aware that since NetCDF4, it's actually just based on HDF5. So depending on your use cases it _might_ be possible to use [nimhdf5](https://github.com/vindaar/nimhdf5). At least HDFView can easily read NetCDF4 files. So it should be no problem to read data from a .nc file using nimhdf5. Writing them and staying compatible with NetCDF might be more of a problem however. I don't know what the standard contains and if it's easy to manually follow it. I'd be willing to give you some help if you want to attempt that.