Hi David,

I vaguely remember CUDA 11.2 was quite a buggy version, at least for our purposes. Maybe we used to have problems with that version too, but I don't recall clearly. Thankfully 11.3 came out soon after, and right now we are using CUDA 11.6 with no problems. I'm letting you know because I don't think you are stuck with CUDA 10; you can give the newest version a try should you be interested.

Thank you,
Ruochun
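As an aside, here is a minimal way to confirm which CUDA driver and runtime a binary actually sees, which can help diagnose machine-to-machine differences like the one in this thread. This is plain CUDA runtime API, not Chrono-specific:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Print the CUDA driver and runtime versions visible to this binary.
    // Versions are encoded as 1000*major + 10*minor, e.g. 11060 = CUDA 11.6.
    int main() {
        int driver = 0, runtime = 0;
        cudaDriverGetVersion(&driver);
        cudaRuntimeGetVersion(&runtime);
        std::printf("driver %d, runtime %d\n", driver, runtime);
        return 0;
    }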
On Tuesday, May 17, 2022 at 9:50:13 PM UTC-5 [email protected] wrote:

Hi Ruochun,

It looks like the problem was the CUDA version used on the original machine. The machine that was having issues was using CUDA 11.2.2, but the other system was using CUDA 10.1.243. After switching the original problematic machine to 10.1.243, the script ran without issue.

Thanks!
David

On Monday, May 16, 2022 at 7:34:23 PM UTC-6 Ruochun Zhang wrote:

Hi David,

Glad that worked for you. In general, that "negative SD" problem means that particles somehow got out of the simulation "world", which is usually a consequence of unusually large penetrations (and the subsequent huge velocities). To avoid that, the typical things to do are reducing the time step size and checking that you don't instantiate particles overlapping with each other. I know that the GPU execution order will make each DEM simulation slightly different from the others, but statistically they should be the same, and since I (and you, on the second machine) can consistently run that script, I don't think this is the cause; it is more likely that the operating systems caused the code to compile differently on these two machines.

I would be interested in knowing what you find out in the end; it would be a help to me.

Thank you!
Ruochun
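To make the "reduce the step size and avoid initial overlap" advice concrete, here is a minimal sketch against the Chrono::GPU API. The radius, lattice spacing, step size, and box dimensions are placeholders, and the particle-setting call is named SetParticles in recent Chrono versions (SetParticlePositions in some older ones):

    #include "chrono/utils/ChUtilsSamplers.h"
    #include "chrono_gpu/physics/ChSystemGpu.h"

    using namespace chrono;

    // Sketch: seed spheres on an HCP lattice so no two particles start
    // overlapping, and pick a conservative fixed step size.
    void SetupWithoutOverlap(gpu::ChSystemGpu& gpu_sys, float radius) {
        // Lattice spacing a bit above one diameter guarantees zero initial overlap
        utils::HCPSampler<float> sampler(2.05f * radius);
        auto points = sampler.SampleBox(ChVector<float>(0, 0, 10),  // center (placeholder)
                                        ChVector<float>(5, 5, 5));  // half-dims (placeholder)
        gpu_sys.SetParticles(points);

        // A smaller fixed step is the first remedy against large
        // penetrations and the "negative SD" type of failure
        gpu_sys.SetFixedStepSize(1e-5f);
    }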
On Monday, May 16, 2022 at 7:40:36 PM UTC-5 [email protected] wrote:

Hi Ruochun,

I just tried the script on a different machine using the feature/gpu branch and increasing max_touched to 20, and the script worked, so the issue must just be something with the setup on the system I was using. I'll put an update here once I find out what the differences are between the two machines, in case anyone else has a similar issue.

Thanks a lot for your help!
David

On Monday, May 16, 2022 at 2:47:21 PM UTC-6 David Reger wrote:

I gave it a try with my original mesh and your new mesh, and both still gave the negative local… error around frame 90. You're just using the Chrono version from the repo with the feature/gpu branch, right? If you haven't already, could you try a fresh clone of the repo, apply the max_touched change, and then run the script to see if it's successful, just to make sure that we're both doing the exact same thing and seeing a different outcome?

Thanks!
David

On Monday, May 16, 2022 at 2:28:23 PM UTC-6 Ruochun Zhang wrote:

Hi David,

It's a bit weird; I checked, and I changed almost nothing. I did comment out lines 120~122 (because your json file doesn't have rolling friction defined), but I tested adding them back and it affected nothing; I can still run it. Are you running it with your original mesh? If so, can you have a try with the mesh I attached in an earlier post and let me know if it helps? If it does not help, we can go from there; however, I'd be very confused at that point.

Thank you,
Ruochun

On Monday, May 16, 2022 at 2:43:17 PM UTC-5 [email protected] wrote:

Hi Ruochun,

Sorry, I had made some changes to my script. I redownloaded the original scripts I provided here earlier, and rebuilt Chrono with the feature/gpu branch from a fresh repo clone, with the touched-by-sphere change. After doing all of this and running the exact same script that I had uploaded originally, I now got a "negative local pod in SD" error around frame 90. This is a bit strange since you had managed to run that script successfully, and everything was a clean install with the same script that I uploaded, so it should've had the same outcome as your run. Did you make any changes to the script/json?

On Monday, May 16, 2022 at 12:29:58 PM UTC-6 Ruochun Zhang wrote:

Hi David,

Oh sorry, before you do that, could you try this: I assume you cloned Chrono and built from source. Can you check out the *feature/gpu* branch first, then apply the MAX_SPHERES_TOUCHED_BY_SPHERE change, and then build and try again with the script you failed to run initially? I did apply a bug fix in the *feature/gpu* branch that is probably not in the *develop* branch yet, and I hope to rule out the possibility that this bug was hurting you.

Thank you,
Ruochun

On Monday, May 16, 2022 at 1:23:06 PM UTC-5 Ruochun Zhang wrote:

Hi David,

I am pretty sure that script worked for me until reaching a steady state, like in the picture attached. One thing is that I'd be quite surprised if MAX_SPHERES_TOUCHED_BY_SPHERE = 200 did not make the kernels fail to compile... I'd say something like 32 is the maximum you should assign to it. Maybe you should try something like 30 to see if it works. But if it still gives the same error, we have to have a look at the script. Is it still the same script you attached?

Changing particle sizes has a large impact on the physics, and the "contacts over limit" problem can happen naturally (like in your first question), or as a result of non-physical behavior in the simulation, which is often related to improper sim parameters with respect to the sphere radius. So it's hard to say without context. One thing you should do, of course, is visualize the simulation results before the crash and see if there is anything non-physical.

Thank you,
Ruochun
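A minimal sketch of the kind of per-frame dump that makes such visualization possible. The file naming and frame spacing are arbitrary, and WriteParticleFile is the name used in recent Chrono::GPU versions (WriteFile in some older ones):

    #include "chrono_gpu/physics/ChSystemGpu.h"
    #include <string>

    // Sketch: advance the simulation frame by frame and write one CSV of
    // particle states per frame, so the last frames before a crash can be
    // inspected in a visualization tool such as ParaView.
    void RunAndDump(chrono::gpu::ChSystemGpu& gpu_sys, float frame_dt, int n_frames) {
        for (int f = 0; f < n_frames; f++) {
            gpu_sys.AdvanceSimulation(frame_dt);
            gpu_sys.WriteParticleFile("output/step" + std::to_string(f) + ".csv");
        }
    }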
On Monday, May 16, 2022 at 10:41:03 AM UTC-5 [email protected] wrote:

Actually, it looks like the particle source still isn't working, even when increasing MAX_SPHERES_TOUCHED_BY_SPHERE up to 200. The simulation will run for longer, but still fails with the same contact pairs error. Interestingly, it seems to fail sooner if I make the particle source radius smaller (it fails after 627 pebbles added (step 34) when the source radius is 0.26, and after 31499 pebbles added (step 85) when the source radius is 1.1). Do I just need to increase the number further, or is this a different issue?

Thanks!
David

On Monday, May 16, 2022 at 8:55:47 AM UTC-6 David Reger wrote:

Hi Ruochun,

Thanks for the help, it seems to be working now! I was able to get the particle relocation working as well.

I am interested in the new solver. Let me know when a release/test build is available for it; I'd like to try it out to see if it's faster for these applications.

Thanks!
David

On Friday, May 13, 2022 at 3:43:36 PM UTC-6 Ruochun Zhang wrote:

Hi David,

This issue is a weakness in the default assumption we made that a sphere can have at most 12 contacts. This assumption is made to save GPU memory and to help identify some large-penetration problems in simulations, which are typical with an insufficient time step size. It is fine with near-rigid spherical contacts, but problematic when meshes are involved (each mesh facet in contact with a sphere eats up one slot as well). Imagine a sphere sitting on the tip of a needle made of mesh: it could be in contact with tens of mesh facets, and we haven't even counted the sphere neighbors it can potentially have.

The fix is easy: please go to the file *ChGpuDefines.h* (in chrono\src\chrono_gpu), and replace
*#define MAX_SPHERES_TOUCHED_BY_SPHERE 12*
with
*#define MAX_SPHERES_TOUCHED_BY_SPHERE 20*
or some even larger number if you need it. Rebuild, and your script should run fine. Note the error messages are hard-coded to say 12 is not enough whenever *MAX_SPHERES_TOUCHED_BY_SPHERE* is exceeded, so if 20 is not enough and you need even more, just change it and do not let the error messages confuse you.

Another thing is that it is better to use meshes with relatively uniform triangle sizes. I attached a rebuilt mesh based on your original one. It's optional and does not seem to affect this simulation, but it's good practice.

To answer your other questions: unfortunately, C::GPU does not currently have an *efficient* way of streaming particles into the system. The method you are using (re-initialization) is probably what I would do too if I had to. With a problem size similar to yours, it should be fine. And C::GPU does not have an official API that allows manual particle position changes. However, this should be fairly straightforward to implement. The naive approach is, of course, to do it on the host side with a for loop. If you care about efficiency, then we should instead add one custom GPU kernel call at the end of each iteration that scans the z coordinates of all particles and adds an offset to them if they are below a certain value. It would be nice if you could tailor it to your needs, but if you need help implementing this custom kernel, you can let us know (it may be good to add it as a permanent feature).
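For illustration, a minimal sketch of what such a recycling kernel could look like. Everything here is hypothetical: Chrono::GPU stores particle positions in its own internal layout, so the raw z array, the kernel name, and the launch parameters would all need to be adapted to the library's data structures:

    // Hypothetical kernel: lift any particle that has fallen below z_min
    // by z_offset, "recycling" it to the top of the domain. In practice
    // one would likely also reset that particle's velocity.
    __global__ void RecycleParticlesBelow(float* pos_z,
                                          unsigned int n_particles,
                                          float z_min,
                                          float z_offset) {
        unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n_particles && pos_z[i] < z_min) {
            pos_z[i] += z_offset;
        }
    }

    // Host side, once at the end of each iteration (d_pos_z is a
    // hypothetical device pointer to the particle z coordinates):
    //   unsigned int threads = 128;
    //   unsigned int blocks  = (n_particles + threads - 1) / threads;
    //   RecycleParticlesBelow<<<blocks, threads>>>(d_pos_z, n_particles, z_min, z_offset);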
Lastly, I don't know if you are interested, but in the new generation of DEM simulator that we are currently developing, apart from supporting non-trivial particle geometries, there will be *efficient* ways to do both things (sleeper and active entities; periodic boundaries with no extra cost). It is not out yet, however.

Thank you,
Ruochun

On Thursday, May 12, 2022 at 10:47:27 PM UTC-5 [email protected] wrote:

Hello,

I have been working on using the GPU module in Project Chrono to fill a vessel with spherical particles. I have been able to do so successfully by using the method from the demos: generating particle sheets and allowing them to settle in the vessel. Recently, however, I have been attempting to fill the vessel with a "particle source" method that continuously streams particles into the domain until a certain number of particles is reached. I am unsure if this method is officially supported by the GPU module, and I keep encountering a crash: I receive the error *No available contact pair slots for body # and body #* after the simulation has progressed. It seems to occur sometime after the particles hit the bottom of the vessel. I have tried reducing my timestep, reducing the "flow rate" of incoming particles, changing the height of the particle inflow, and altering some stiffness/damping constants, but this error always seems to happen soon after the particles make contact with the vessel. I have attached my input files; any help would be appreciated.

An unrelated question: does the GPU module support changing particle positions during the simulation (i.e. taking all particles below a certain z and moving them to the top, to "recycle" them continuously during the simulation)?

Thanks!
David
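A rough sketch of the bookkeeping behind the re-initialization "particle source" discussed above: keep the full particle list on the host, append a batch whenever the source fires, and rebuild the system from it. The helper name is made up, and the exact Set*/Initialize calls depend on the Chrono version in use:

    #include "chrono/core/ChVector.h"
    #include <vector>

    using namespace chrono;

    // Hypothetical host-side state: positions of every particle created so far.
    std::vector<ChVector<float>> all_points;

    // Each time the source fires, append the new batch and rebuild the
    // GPU system from the complete list. Re-initialization is costly, but
    // workable at this problem size, as noted above.
    void EmitBatch(const std::vector<ChVector<float>>& new_batch) {
        all_points.insert(all_points.end(), new_batch.begin(), new_batch.end());
        // ...then construct a fresh ChSystemGpu, hand it all_points (plus
        // saved velocities), and Initialize() it again.
    }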
