Thanks! Also, I have another, unrelated question. I want to assign particles a group ID based on their position when a simulation is first started (starting from a checkpoint file), so that I can track the particles in each group as the simulation progresses. I just want to dump the group IDs as a column in the particle output file. Could you give some guidance on which files I would need to modify to add this functionality?
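Something like the untested sketch below is what I'm picturing. GetNumParticles(), GetParticlePosition(), and WriteParticleFile() are the names I found in ChSystemGpu.h (please correct me if they differ on this branch), and the z-binning rule and helper names are just placeholders:

    // Sketch: tag each particle with a group ID computed from its initial
    // position, then dump a CSV with the group ID as an extra column.
    #include <fstream>
    #include <string>
    #include <vector>

    #include "chrono_gpu/physics/ChSystemGpu.h"

    using namespace chrono;
    using namespace chrono::gpu;

    // Placeholder rule: slice the initial bed into horizontal bins of height dz.
    std::vector<int> AssignGroups(ChSystemGpu& sys, float z0, float dz) {
        std::vector<int> group(sys.GetNumParticles());
        for (unsigned int i = 0; i < sys.GetNumParticles(); i++)
            group[i] = static_cast<int>((sys.GetParticlePosition(i).z() - z0) / dz);
        return group;
    }

    // Called at every output frame, instead of (or alongside) WriteParticleFile().
    void WriteFrameWithGroups(ChSystemGpu& sys, const std::vector<int>& group,
                              const std::string& fname) {
        std::ofstream out(fname);
        out << "x,y,z,group\n";
        for (unsigned int i = 0; i < sys.GetNumParticles(); i++) {
            ChVector<float> p = sys.GetParticlePosition(i);
            out << p.x() << "," << p.y() << "," << p.z() << "," << group[i] << "\n";
        }
    }

The one thing I'm unsure about is whether the particle ordering behind GetParticlePosition(i) is guaranteed to stay fixed over the run; if it is, nothing in the library itself would need to change.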
Thanks!
David

On Tuesday, May 17, 2022 at 9:22:18 PM UTC-6 Ruochun Zhang wrote:

Hi David,

I vaguely remember CUDA 11.2 being quite a buggy version, at least for our purposes. Maybe we used to have problems with that version too, but I don't recall clearly. Thankfully 11.3 came out soon after, and right now we are using CUDA 11.6 with no problems. I'm letting you know because I don't think you are stuck with CUDA 10; you can give the newest version a try if you are interested.

Thank you,
Ruochun

On Tuesday, May 17, 2022 at 9:50:13 PM UTC-5 [email protected] wrote:

Hi Ruochun,

It looks like the problem was the CUDA version used on the original machine. The machine that was having issues was using CUDA 11.2.2, while the other system was using CUDA 10.1.243. After switching the problematic machine to 10.1.243, the script ran without issue.

Thanks!
David

On Monday, May 16, 2022 at 7:34:23 PM UTC-6 Ruochun Zhang wrote:

Hi David,

Glad that worked for you. In general, that "negative SD" problem means particles somehow got out of the simulation "world", usually as a consequence of unusually large penetrations (and the resulting huge velocities). To avoid it, the typical things to do are reducing the time step size and checking that you don't instantiate particles overlapping with each other; for the latter, a sampler that enforces a minimum separation is an easy guarantee (see the sketch below).
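(Sketch only; utils::PDSampler from chrono/utils/ChUtilsSamplers.h is what the GPU demos use, and the box numbers here are placeholders:)

    // Sketch: generate a non-overlapping initial packing inside a box.
    // PDSampler enforces a minimum separation between sampled points, so
    // spheres of radius `radius` never start interpenetrated.
    #include <vector>

    #include "chrono/core/ChVector.h"
    #include "chrono/utils/ChUtilsSamplers.h"

    using namespace chrono;

    std::vector<ChVector<float>> MakePacking(float radius) {
        // A separation slightly above one diameter leaves a small safety gap.
        utils::PDSampler<float> sampler(2.05f * radius);
        ChVector<float> center(0, 0, 10);  // placeholder box center
        ChVector<float> hdims(5, 5, 3);    // placeholder box half-dimensions
        return sampler.SampleBox(center, hdims);
    }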
I know that the GPU execution order will make each DEM simulation slightly different from the others, but statistically they should be the same; and since I (and you, on the second machine) can consistently run that script, I don't think this is the cause. It is more likely that the operating systems caused the code to compile differently on these two machines.

I would be interested in knowing what you find out in the end; it would be a help to me.

Thank you!
Ruochun

On Monday, May 16, 2022 at 7:40:36 PM UTC-5 [email protected] wrote:

Hi Ruochun,

I just tried the script on a different machine, using the feature/gpu branch with max_touched increased to 20, and it worked, so the issue must be something in the setup of the system I was using. I'll post an update here once I find out what the differences between the two machines are, in case anyone else has a similar issue.

Thanks a lot for your help!
David

On Monday, May 16, 2022 at 2:47:21 PM UTC-6 David Reger wrote:

I gave it a try with both my original mesh and your new mesh, and both still gave the negative local… error around frame 90. You're just using the Chrono version from the repo with the feature/gpu branch, right? If you haven't already, could you try a fresh clone of the repo, apply the max_touched change, and then run the script to see if it succeeds, just to make sure that we're both doing the exact same thing and seeing a different outcome?

Thanks!
David

On Monday, May 16, 2022 at 2:28:23 PM UTC-6 Ruochun Zhang wrote:

Hi David,

It's a bit weird; I checked, and I changed almost nothing. I did comment out lines 120~122 (because your json file does not define rolling friction), but I tested adding them back and it affected nothing; I can still run it. Are you running it with your original mesh? If so, can you try the mesh I attached in an earlier post and let me know if it helps? If it does not help, we can go from there, though I'd be very confused at that point.

Thank you,
Ruochun

On Monday, May 16, 2022 at 2:43:17 PM UTC-5 [email protected] wrote:

Hi Ruochun,

Sorry, I had made some changes to my script. I re-downloaded the original scripts I provided here earlier and rebuilt Chrono from a fresh clone of the repo, on the feature/gpu branch, with the touched-by-sphere change. After doing all of this and running the exact same script I had uploaded originally, I now get a "negative local pos in SD" error around frame 90. This is a bit strange, since you managed to run that script successfully, and everything here was a clean install with the same script I uploaded, so it should have had the same outcome as your run. Did you make any changes to the script/json?

On Monday, May 16, 2022 at 12:29:58 PM UTC-6 Ruochun Zhang wrote:

Hi David,

Oh, sorry, before you do that, could you try this: I assume you cloned Chrono and built from source. Could you check out the *feature/gpu* branch first, then apply the MAX_SPHERES_TOUCHED_BY_SPHERE change, and then build and try again with the script you initially failed to run? I applied a bug fix on the *feature/gpu* branch that is probably not in the *develop* branch yet, and I want to rule out the possibility that this bug was hurting you.

Thank you,
Ruochun

On Monday, May 16, 2022 at 1:23:06 PM UTC-5 Ruochun Zhang wrote:

Hi David,

I am pretty sure that script worked for me until it reached a steady state, as in the attached picture. One thing: I'd be quite surprised if the kernels did not fail to compile with MAX_SPHERES_TOUCHED_BY_SPHERE = 200... I'd say something like 32 is the maximum you should assign to it. Maybe try something like 30 and see if it works. If it still gives the same error, we will have to look at the script. Is it still the same script you attached?

Changing particle sizes has a large impact on the physics, and the "contacts over limit" problem can happen naturally (as in your first question) or as a result of non-physical behavior in the simulation, which is often related to improper simulation parameters with respect to the sphere radius. So it's hard to say without context. One thing you should do, of course, is visualize the simulation results before the crash and look for anything non-physical.

Thank you,
Ruochun

On Monday, May 16, 2022 at 10:41:03 AM UTC-5 [email protected] wrote:

Actually, it looks like the particle source still isn't working, even with MAX_SPHERES_TOUCHED_BY_SPHERE increased to 200. The simulation runs for longer, but still fails with the same contact-pairs error.
Interestingly, it seems to fail sooner when I make the particle source radius smaller: it fails after 627 pebbles have been added (step 34) with a source radius of 0.26, but after 31499 pebbles (step 85) with a source radius of 1.1. Do I just need to increase the number further, or is this a different issue?

Thanks!
David

On Monday, May 16, 2022 at 8:55:47 AM UTC-6 David Reger wrote:

Hi Ruochun,

Thanks for the help, it seems to be working now! I was able to get the particle relocation working as well.

I am interested in the new solver. Let me know when a release/test build is available for it; I'd like to try it out and see if it's faster for these applications.

Thanks!
David

On Friday, May 13, 2022 at 3:43:36 PM UTC-6 Ruochun Zhang wrote:

Hi David,

This issue comes from a weakness in our default assumption that a sphere can have at most 12 contacts. The assumption is made to save GPU memory and to help identify large-penetration problems in a simulation, which are typical of an insufficient time step size. It is fine for near-rigid spherical contacts, but problematic when meshes are involved, since each mesh facet in contact with a sphere eats up one slot as well. Imagine a sphere sitting on the tip of a needle made of mesh: it could be in contact with tens of mesh facets, and we haven't even counted the sphere neighbors it can potentially have.

The fix is easy: go to the file *ChGpuDefines.h* (in chrono\src\chrono_gpu) and replace
*#define MAX_SPHERES_TOUCHED_BY_SPHERE 12*
with
*#define MAX_SPHERES_TOUCHED_BY_SPHERE 20*
or an even larger number if you need it. Rebuild, and your script should run fine. Note that the error messages are hard-coded to say 12 is not enough whenever *MAX_SPHERES_TOUCHED_BY_SPHERE* is exceeded, so if 20 is not enough and you need even more, just change it and do not let the error messages confuse you.

Another thing: it is better to use meshes with relatively uniform triangle sizes. I attached a rebuilt mesh based on your original one. It's optional and does not seem to affect this simulation, but it's good practice.

To answer your other questions: unfortunately, C::GPU does not currently have an *efficient* way of streaming particles into the system. The method you are using (re-initialization) is probably what I would do too if I had to, and with a problem size similar to yours it should be fine. C::GPU also does not have an official API for manually changing particle positions, but this should be fairly straightforward to implement. The naive approach is, of course, to do it on the host side with a for loop, along the lines of the sketch below. If you care about efficiency, we should instead add one custom GPU kernel call at the end of each iteration that scans the z coordinates of all particles and adds an offset to those below a certain value. It would be nice if you can tailor it to your needs, but if you need help implementing this custom kernel, let us know (it may be good to add it as a permanent feature).
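(A rough, untested host-side sketch; GetParticlePosition and GetParticleVelocity are the current accessor names, the setter has changed names between versions (SetParticles / SetParticlePositions), and you would still go through the same re-initialization you already use for the particle source:)

    // Sketch: naive host-side "recycling". Pull every particle below z_min
    // back up by `lift`, keep its velocity, and feed the arrays back through
    // the re-initialization path used for the particle source.
    #include <vector>

    #include "chrono_gpu/physics/ChSystemGpu.h"

    using namespace chrono;
    using namespace chrono::gpu;

    void RecycleParticles(ChSystemGpu& sys, float z_min, float lift) {
        unsigned int n = sys.GetNumParticles();
        std::vector<ChVector<float>> pos, vel;
        pos.reserve(n);
        vel.reserve(n);
        for (unsigned int i = 0; i < n; i++) {
            ChVector<float> p = sys.GetParticlePosition(i);
            if (p.z() < z_min)
                p.z() += lift;  // move it back to the top of the domain
            pos.push_back(p);
            vel.push_back(sys.GetParticleVelocity(i));
        }
        // Setter name varies by version; re-Initialize() if yours requires it.
        sys.SetParticles(pos, vel);
    }

The GPU-kernel version would do the same comparison and offset in place on the device arrays, which avoids the two host-device copies per call.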
Lastly, I don't know if you are interested, but in the new generation of DEM simulator that we are currently developing, apart from support for non-trivial particle geometries, there will be *efficient* ways to do both things (sleeper and active entities; periodic boundaries at no extra cost). It is not out yet, however.

Thank you,
Ruochun

On Thursday, May 12, 2022 at 10:47:27 PM UTC-5 [email protected] wrote:

Hello,

I have been working on using the GPU module in Project Chrono to fill a vessel with spherical particles. I have been able to do so successfully with the method from the demos: generating particle sheets and allowing them to settle in the vessel. Recently, however, I have been attempting to fill the vessel with a "particle source" method that continuously streams particles into the domain until a certain number of particles is reached. I am unsure whether this method is officially supported by the GPU module, and I keep encountering a crash: I receive the error *No available contact pair slots for body # and body #* after the simulation has progressed for a while. It seems to occur sometime after the particles hit the bottom of the vessel. I have tried reducing my time step, reducing the "flow rate" of incoming particles, changing the height of the particle inflow, and altering some stiffness/damping constants, but this error always seems to happen soon after the particles make contact with the vessel. I have attached my input files; any help would be appreciated.

An unrelated question: does the GPU module support changing particle positions during the simulation (i.e., taking all particles below a certain z and moving them to the top, to "recycle" them continuously)?

Thanks!
David
