Hi,

We are trying to load ORC data (around 50 GB) from S3 via Spark using the DataFrame API. The write starts fast with good throughput, but after some time throughput drops sharply and the job gets stuck.
We also tried changing several configurations, but no luck:

1. Enabling checkpoint write throttling
2. Disabling throttling and increasing the checkpoint buffer

Please find below the configuration and properties of the cluster:

1. 10-node cluster of r4.4xlarge instances (AWS EMR), shared with Spark
2. Ignite is started with -Xms20g -Xmx30g
3. Cache mode is PARTITIONED
4. Persistence is enabled
5. Direct I/O is enabled
6. No backups

<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <!-- Write throttling disabled. -->
        <property name="writeThrottlingEnabled" value="false"/>
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
                <property name="checkpointPageBufferSize" value="#{20L * 1024 * 1024 * 1024}"/>
                <property name="name" value="Default_Region"/>
                <property name="maxSize" value="#{60L * 1024 * 1024 * 1024}"/>
            </bean>
        </property>
        <property name="walMode" value="NONE"/>
    </bean>
</property>

Thanks in advance,
Rahul Aneja
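For reference, the Spring SpEL size expressions in the configuration resolve to the byte values below; this is just a standalone arithmetic check (plain Python, variable names are illustrative, not part of the Ignite config):

```python
# Byte values of the SpEL expressions used in the data storage config.
GIB = 1024 ** 3  # one gibibyte

# #{20L * 1024 * 1024 * 1024} -> checkpointPageBufferSize (20 GiB)
checkpoint_page_buffer = 20 * GIB
# #{60L * 1024 * 1024 * 1024} -> maxSize of Default_Region (60 GiB)
region_max_size = 60 * GIB

print(checkpoint_page_buffer)  # 21474836480
print(region_max_size)         # 64424509440
```

Together with the -Xmx30g heap, these off-heap sizes are what the 122 GB of an r4.4xlarge node has to accommodate.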